The Capacity of Some Pólya String Models
Abstract
We study random string-duplication systems, which we call Pólya string models. These are motivated by DNA storage in living organisms, and certain random mutation processes that affect their genome. Unlike previous works that study the combinatorial capacity of string-duplication systems, or various string statistics, this work provides exact capacity or bounds on it, for several probabilistic models. In particular, we study the capacity of noisy string-duplication systems, including the tandem-duplication, end-duplication, and interspersed-duplication systems. Interesting connections are drawn between some systems and the signature of random permutations, as well as to the beta distribution common in population genetics.
Additional Information
The material in this paper was presented in part at the 2016 IEEE International Symposium on Information Theory.Attached Files
Submitted - etr142.pdf
Files
Name | Size | Download all |
---|---|---|
md5:434c0e69e7438d944b0fd26b37901320
|
357.2 kB | Preview Download |
Additional details
- Eprint ID
- 88948
- Resolver ID
- CaltechAUTHORS:20180820-100255874
- Created
-
2018-08-20Created from EPrint's datestamp field
- Updated
-
2021-08-18Created from EPrint's last_modified field
- Caltech groups
- Parallel and Distributed Systems Group
- Series Name
- Parallel and Distributed Systems Group Technical Reports
- Series Volume or Issue Number
- 142
- Other Numbering System Name
- PARADISE
- Other Numbering System Identifier
- etr-142