Sequence Design for a Test Tube of Interacting Nucleic Acid Strands

Creators: Wolfe, Brian R.; Pierce, Niles A.

Abstract

We describe an algorithm for designing the equilibrium base-pairing properties of a test tube of interacting nucleic acid strands. A target test tube is specified as a set of desired "on-target" complexes, each with a target secondary structure and target concentration, and a set of undesired "off-target" complexes, each with vanishing target concentration. Sequence design is performed by optimizing the test tube ensemble defect, corresponding to the concentration of incorrectly paired nucleotides at equilibrium evaluated over the ensemble of the test tube. To reduce the computational cost of accepting or rejecting mutations to a random initial sequence, the structural ensemble of each on-target complex is hierarchically decomposed into a tree of conditional subensembles, yielding a forest of decomposition trees. Candidate sequences are evaluated efficiently at the leaf level of the decomposition forest by estimating the test tube ensemble defect from conditional physical properties calculated over the leaf subensembles. As optimized subsequences are merged toward the root level of the forest, any emergent defects are eliminated via ensemble redecomposition and sequence reoptimization. After successfully merging subsequences to the root level, the exact test tube ensemble defect is calculated for the first time, explicitly checking for the effect of the previously neglected off-target complexes. Any off-target complexes that form at appreciable concentration are hierarchically decomposed, added to the decomposition forest, and actively destabilized during subsequent forest reoptimization. For target test tubes representative of design challenges in the molecular programming and synthetic biology communities, our test tube design algorithm typically succeeds in achieving a normalized test tube ensemble defect ≤1% at a design cost within an order of magnitude of the cost of test tube analysis.
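As a rough illustration of the objective described above, the sketch below computes a test tube ensemble defect from per-complex quantities. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract only defines the defect as the concentration of incorrectly paired nucleotides at equilibrium, so the two-term form used here (a structural term plus a concentration-deficiency term) and the normalization by total target nucleotide concentration are assumptions. The names `OnTargetComplex`, `tube_defect`, and `normalized_tube_defect` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class OnTargetComplex:
    """Per-complex quantities (hypothetical container, not from the paper)."""
    size: int           # number of nucleotides in the complex
    defect: float       # average number of incorrectly paired nucleotides per copy
    conc: float         # equilibrium concentration (M)
    target_conc: float  # target concentration (M)


def tube_defect(complexes: Iterable[OnTargetComplex]) -> float:
    """Concentration of incorrectly paired nucleotides (assumed two-term form)."""
    total = 0.0
    for c in complexes:
        # Structural term: defective nucleotides in copies that did form,
        # counted up to the target concentration.
        structural = c.defect * min(c.conc, c.target_conc)
        # Deficiency term: every nucleotide of a missing copy counts as defective.
        deficiency = c.size * max(c.target_conc - c.conc, 0.0)
        total += structural + deficiency
    return total


def normalized_tube_defect(complexes: List[OnTargetComplex]) -> float:
    """Defect divided by total target nucleotide concentration (assumed)."""
    total_nt = sum(c.size * c.target_conc for c in complexes)
    return tube_defect(complexes) / total_nt


if __name__ == "__main__":
    tube = [OnTargetComplex(size=100, defect=2.0, conc=0.9e-6, target_conc=1.0e-6)]
    print(f"normalized defect: {normalized_tube_defect(tube):.3f}")
```

Under these assumptions, off-target complexes contribute only indirectly: any strand material they sequester lowers the on-target concentrations and so raises the deficiency term.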

Additional Information

© 2014 American Chemical Society. Received: March 19, 2014; Published: October 20, 2014. The authors thank J. S. Bois, J. N. Zadeh, and N. J. Porubsky for helpful discussions and M. Kirk for assistance with bibliography data entry. This work was funded by the National Science Foundation via the Molecular Programming Project (NSF-CCF-0832824 and NSF-CCF-1317694), by the Gordon and Betty Moore Foundation (GBMF2809), by the John Simon Guggenheim Memorial Foundation, and by the Beckman Institute at Caltech.

Attached Files

Supplemental Material - sb5002196_si_001.pdf

Supplemental Material - sb5002196_si_002.zip

Files

Name	MD5	Size
sb5002196_si_001.pdf	786919f0c4bac0507b83f49ade4cca72	1.5 MB
sb5002196_si_002.zip	5ba69e6fa2c6ce461eae4afc27f4f0c8	35.2 kB
