Sequence Design for a Test Tube of Interacting Nucleic Acid Strands
- Creators
- Wolfe, Brian R.
- Pierce, Niles A.
Abstract
We describe an algorithm for designing the equilibrium base-pairing properties of a test tube of interacting nucleic acid strands. A target test tube is specified as a set of desired "on-target" complexes, each with a target secondary structure and target concentration, and a set of undesired "off-target" complexes, each with vanishing target concentration. Sequence design is performed by optimizing the test tube ensemble defect, corresponding to the concentration of incorrectly paired nucleotides at equilibrium evaluated over the ensemble of the test tube. To reduce the computational cost of accepting or rejecting mutations to a random initial sequence, the structural ensemble of each on-target complex is hierarchically decomposed into a tree of conditional subensembles, yielding a forest of decomposition trees. Candidate sequences are evaluated efficiently at the leaf level of the decomposition forest by estimating the test tube ensemble defect from conditional physical properties calculated over the leaf subensembles. As optimized subsequences are merged toward the root level of the forest, any emergent defects are eliminated via ensemble redecomposition and sequence reoptimization. After successfully merging subsequences to the root level, the exact test tube ensemble defect is calculated for the first time, explicitly checking for the effect of the previously neglected off-target complexes. Any off-target complexes that form at appreciable concentration are hierarchically decomposed, added to the decomposition forest, and actively destabilized during subsequent forest reoptimization. For target test tubes representative of design challenges in the molecular programming and synthetic biology communities, our test tube design algorithm typically succeeds in achieving a normalized test tube ensemble defect ≤1% at a design cost within an order of magnitude of the cost of test tube analysis.
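The objective described in the abstract, the concentration of incorrectly paired nucleotides at equilibrium, can be made concrete with a small calculation. The sketch below is illustrative only. The abstract gives no formula, so the expression used here is an assumption: each on-target complex contributes its expected number of mispaired nucleotides, weighted by the concentration that actually forms, and every nucleotide of any shortfall below the target concentration counts as a defect. The class name, function names, and numeric values are hypothetical, and the per-complex defects and equilibrium concentrations are taken as given inputs rather than computed from a thermodynamic model.

```python
from dataclasses import dataclass


@dataclass
class OnTarget:
    """Hypothetical summary of one on-target complex (inputs assumed precomputed)."""
    length: int            # number of nucleotides in the complex
    complex_defect: float  # expected number of incorrectly paired nucleotides
    conc: float            # equilibrium concentration (M)
    target_conc: float     # target concentration (M)


def tube_ensemble_defect(on_targets):
    """Concentration of incorrectly paired nucleotides (assumed form, see lead-in)."""
    total = 0.0
    for c in on_targets:
        # Structural term: mispaired nucleotides within the complexes that did form.
        structural = c.complex_defect * min(c.conc, c.target_conc)
        # Concentration term: every nucleotide of the missing complexes counts as a defect.
        deficiency = c.length * max(c.target_conc - c.conc, 0.0)
        total += structural + deficiency
    return total


def normalized_defect(on_targets):
    """Defect as a fraction of the total target nucleotide concentration."""
    total_nt_conc = sum(c.length * c.target_conc for c in on_targets)
    return tube_ensemble_defect(on_targets) / total_nt_conc


if __name__ == "__main__":
    # Hypothetical test tube with two on-target complexes at 1 uM target concentration.
    tube = [
        OnTarget(length=50, complex_defect=0.5, conc=0.98e-6, target_conc=1e-6),
        OnTarget(length=80, complex_defect=0.4, conc=1.00e-6, target_conc=1e-6),
    ]
    print(f"normalized defect: {normalized_defect(tube):.4f}")
```

Under this assumed form, off-target complexes do not appear as explicit terms: any material they consume shows up as a concentration shortfall in the on-target complexes.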
Additional Information
© 2014 American Chemical Society. Received: March 19, 2014; Published: October 20, 2014. The authors thank J. S. Bois, J. N. Zadeh, and N. J. Porubsky for helpful discussions and M. Kirk for assistance with bibliography data entry. This work was funded by the National Science Foundation via the Molecular Programming Project (NSF-CCF-0832824 and NSF-CCF-1317694), by the Gordon and Betty Moore Foundation (GBMF2809), by the John Simon Guggenheim Memorial Foundation, and by the Beckman Institute at Caltech.
Attached Files
Supplemental Material - sb5002196_si_001.pdf
Supplemental Material - sb5002196_si_002.zip
Files
Name | Size |
---|---|
md5:786919f0c4bac0507b83f49ade4cca72 | 1.5 MB |
md5:5ba69e6fa2c6ce461eae4afc27f4f0c8 | 35.2 kB |
Additional details
- Eprint ID
- 50826
- DOI
- 10.1021/sb5002196
- Resolver ID
- CaltechAUTHORS:20141027-090434819
- Funding
- NSF: CCF-0832824
- NSF: CCF-1317694
- Gordon and Betty Moore Foundation: GBMF2809
- John Simon Guggenheim Foundation
- Caltech Beckman Institute
- Created
- 2014-10-27 (created from EPrint's datestamp field)
- Updated
- 2021-11-10 (created from EPrint's last_modified field)