A Stochastic Model for Genomic Interspersed Duplication
Abstract
Mutation processes such as point mutation, insertion, deletion, and duplication (including tandem and interspersed duplication) play an important role in evolution, as they lead to genomic diversity and thus to phenotypic variation. In this work, we study the expressive power of interspersed duplication, i.e., its ability to generate diversity, via a simple but fundamental stochastic model in which the length and location of the duplicated subsequence and the point of insertion of the copy are chosen randomly. In contrast to combinatorial models, where the goal is to determine the set of possible outcomes regardless of their likelihood, in stochastic systems we investigate the properties of the set of high-probability sequences. In particular, we provide results regarding the asymptotic behavior of the frequencies of symbols and short words in a sequence evolving through interspersed duplication. The study of such systems is an important step towards the design and analysis of more realistic and sophisticated models of genomic mutation processes.
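The kind of process the abstract describes can be illustrated with a short simulation. This is only a sketch and is not taken from the report. The abstract says the length, location, and insertion point are chosen randomly but does not give the distributions, so uniform choices are assumed here. The function names and the `max_len` cap are hypothetical and introduced for illustration only.

```python
import random
from collections import Counter


def interspersed_duplication_step(seq, rng, max_len=None):
    """Copy a random substring of seq and insert the copy at a random position.

    Assumption (not from the report): length, start, and insertion point
    are each drawn uniformly from their valid ranges.
    """
    n = len(seq)
    limit = n if max_len is None else min(max_len, n)
    length = rng.randint(1, limit)        # length of the duplicated substring
    start = rng.randint(0, n - length)    # where the substring begins
    insert_at = rng.randint(0, n)         # one of the n + 1 gaps in seq
    return seq[:insert_at] + seq[start:start + length] + seq[insert_at:]


def simulate(seed_seq, steps, seed=0, max_len=None):
    """Apply `steps` random interspersed duplications to seed_seq."""
    rng = random.Random(seed)
    seq = seed_seq
    for _ in range(steps):
        seq = interspersed_duplication_step(seq, rng, max_len)
    return seq


def symbol_frequencies(seq):
    """Return the empirical frequency of each symbol in seq."""
    counts = Counter(seq)
    return {symbol: count / len(seq) for symbol, count in counts.items()}


if __name__ == "__main__":
    final = simulate("ACGT", steps=200, seed=0, max_len=3)
    print(len(final), symbol_frequencies(final))
```

Capping the duplication length with `max_len` keeps the sequence from growing too quickly in long runs. Without the cap, the sequence can as much as double in a single step.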
Attached Files
Submitted - etr129.pdf
Files
Name | Size |
---|---|
md5:d868d4edb6ab800e84978a616e71e65c | 640.6 kB |
Additional details
- Eprint ID
- 54604
- Resolver ID
- CaltechAUTHORS:20150209-161532302
- Created
- 2015-02-10 (created from EPrint's datestamp field)
- Updated
- 2021-08-18 (created from EPrint's last_modified field)
- Caltech groups
- Parallel and Distributed Systems Group
- Other Numbering System Name
- Paradise
- Other Numbering System Identifier
- ETR129