The Capacity of String-Duplication Systems
Abstract
It is known that the majority of the human genome consists of duplicated sequences. Furthermore, it is believed that a significant part of the rest of the genome also originated from duplicated sequences and has mutated to its current form. In this paper, we investigate the possibility of constructing an exponentially large number of sequences from a short initial sequence using simple duplication rules, including those resembling genomic-duplication processes. In other words, our goal is to find the capacity, or the expressive power, of these string-duplication systems. Our results include exact capacities, and bounds on the capacities, of four fundamental string-duplication systems. The study of these fundamental biologically inspired systems is an important step toward modeling and analyzing more complex biological processes.
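As a rough illustration of what "capacity" measures, the sketch below enumerates the strings reachable from a short seed under one duplication rule and reports how fast their number grows with length. It makes several assumptions that the abstract does not spell out: the rule is tandem duplication (a substring is copied and inserted immediately after itself, so `uvw` becomes `uvvw`), and capacity is read as the growth rate `log(count) / n` with the logarithm taken to the base of the alphabet size. The function names `tandem_duplicates`, `reachable`, and `rate_by_length` are hypothetical and chosen only for this example. The search is a brute-force one, practical only for small lengths.

```python
from math import log


def tandem_duplicates(s, max_len):
    """Yield strings obtained from s by one tandem duplication
    (u v w -> u v v w), skipping results longer than max_len."""
    n = len(s)
    for i in range(n):                    # start of the duplicated block v
        for k in range(1, n - i + 1):     # length of the duplicated block v
            if n + k > max_len:
                break                     # longer blocks only overshoot further
            yield s[:i + k] + s[i:i + k] + s[i + k:]


def reachable(seed, max_len):
    """Return the set of strings of length <= max_len reachable from seed
    by any number of tandem duplications (seed included)."""
    seen = {seed}
    frontier = [seed]
    while frontier:
        current = frontier.pop()
        for t in tandem_duplicates(current, max_len):
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return seen


def rate_by_length(seed, max_len, alphabet_size):
    """Map each length n to log_{alphabet_size}(number of reachable strings
    of length n) / n, an empirical stand-in for the capacity (assumed
    definition; alphabet_size must be at least 2)."""
    counts = {}
    for t in reachable(seed, max_len):
        counts[len(t)] = counts.get(len(t), 0) + 1
    return {n: log(c, alphabet_size) / n for n, c in sorted(counts.items())}


if __name__ == "__main__":
    for n, rate in rate_by_length("ab", 8, alphabet_size=2).items():
        print(n, round(rate, 3))
```

Running the script prints the empirical rate for each length up to 8 from the seed `ab`. These finite-length values are only suggestive, since a capacity is an asymptotic quantity.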
Additional Information
© 2016 IEEE. Manuscript received November 24, 2014; revised July 10, 2015; accepted October 26, 2015. Date of publication December 4, 2015; date of current version January 18, 2016. This work was supported by the National Science Foundation within the Expeditions in Computing Program through the Molecular Programming Project. This paper was presented in part at the 2014 IEEE International Symposium on Information Theory.
Attached Files
Submitted - 1401.4634v1.pdf
Files
Name | Size |
---|---|
md5:015dc0b70d09ea6fd97b0f7a2aa12e6c | 156.0 kB |
Additional details
- Eprint ID: 63773
- Resolver ID: CaltechAUTHORS:20160119-142638953
- Funder: NSF
- Created: 2016-01-19 (Created from EPrint's datestamp field)
- Updated: 2021-11-10 (Created from EPrint's last_modified field)