Viral Population Estimation Using Pyrosequencing
Abstract
The diversity of virus populations within single infected hosts presents a major difficulty for the natural immune response as well as for vaccine design and antiviral drug therapy. Recently developed pyrophosphate-based sequencing technologies (pyrosequencing) can be used for quantifying this diversity by ultra-deep sequencing of virus samples. We present computational methods for the analysis of such sequence data and apply these techniques to pyrosequencing data obtained from HIV populations within patients harboring drug-resistant virus strains. Our main result is the estimation of the population structure of the sample from the pyrosequencing reads. This inference is based on a statistical approach to error correction, followed by a combinatorial algorithm for constructing a minimal set of haplotypes that explain the data. Using this set of explaining haplotypes, we apply a statistical model to infer the frequencies of the haplotypes in the population via an expectation–maximization (EM) algorithm. We demonstrate that pyrosequencing reads allow for effective population reconstruction by extensive simulations and by comparison to 165 sequences obtained directly from clonal sequencing of four independent, diverse HIV populations. Thus, pyrosequencing can be used for cost-effective estimation of the structure of virus populations, promising new insights into viral evolutionary dynamics and disease control strategies.
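The frequency-inference step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a candidate haplotype set is already given, that each read is a `(start, sequence)` pair aligned to the haplotypes, and that bases are miscalled independently at a hypothetical rate `eps`, with errors spread uniformly over the three other bases. The function names and parameters are illustrative only.

```python
from typing import List, Sequence, Tuple


def read_likelihood(read: str, start: int, hap: str, eps: float) -> float:
    """P(read | haplotype) under an assumed independent per-base error rate eps."""
    lik = 1.0
    for offset, base in enumerate(read):
        lik *= (1.0 - eps) if hap[start + offset] == base else eps / 3.0
    return lik


def em_haplotype_frequencies(
    reads: Sequence[Tuple[int, str]],
    haplotypes: Sequence[str],
    eps: float = 0.01,
    iters: int = 200,
    tol: float = 1e-9,
) -> List[float]:
    """Estimate haplotype frequencies by EM on a simple mixture model."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    k = len(haplotypes)
    freqs = [1.0 / k] * k  # start from the uniform distribution
    # Read-by-haplotype likelihoods do not change between iterations.
    lik = [[read_likelihood(seq, start, h, eps) for h in haplotypes]
           for start, seq in reads]
    for _ in range(iters):
        counts = [0.0] * k
        for row in lik:
            # E-step: posterior probability that each haplotype produced the read.
            weights = [f * l for f, l in zip(freqs, row)]
            total = sum(weights)
            for j, w in enumerate(weights):
                counts[j] += w / total
        # M-step: new frequencies are the average posterior assignments.
        new_freqs = [c / len(reads) for c in counts]
        converged = max(abs(a - b) for a, b in zip(freqs, new_freqs)) < tol
        freqs = new_freqs
        if converged:
            break
    return freqs
```

For example, `em_haplotype_frequencies([(0, "AAAA"), (0, "AAAA"), (0, "AAAA"), (0, "CCCC")], ["AAAA", "CCCC"])` returns frequencies close to 0.75 and 0.25, because each read is compatible with essentially one haplotype.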
Additional Information
© 2008 Eriksson et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Received: July 2, 2007; Accepted: March 27, 2008; Published: May 9, 2008.
N. Eriksson and L. Pachter were partially supported by the NSF (grants DMS-0603448 and CCF-0347992, respectively). N. Beerenwinkel was funded by a grant from the Bill and Melinda Gates Foundation through the Grand Challenges in Global Health Initiative. The NSF has played no role in any part of this work.
The authors have declared that no competing interests exist.
Author Contributions. Performed the experiments: YM SR CW BG MR RS. Analyzed the data: NE LP NB. Wrote the paper: NE LP NB.
Attached Files
Published - journal.pcbi.1000074_1_.PDF
Submitted - 0707.0114.pdf
Supplemental Material - Figure_S3_1_.pdf
Supplemental Material - Figure_S4.pdf
Supplemental Material - Figure_Supp2.pdf
Supplemental Material - upp.1.pdf
Files
MD5 checksum | Size |
---|---|
md5:68213148e40be649ef1317f9e595fb47 | 9.3 kB |
md5:61905ed4cce36b090ff4455664f33c2c | 10.0 kB |
md5:d96d2c0e6a80e81886c048551cdf277d | 442.6 kB |
md5:6192f13b10c61279e8fcd75b0a30d162 | 24.5 kB |
md5:ef336f214472e0f136d529a563f7efc8 | 2.2 MB |
md5:25524a372b58313f57f7f4c73200ae3f | 12.1 kB |
Additional details
- PMCID: PMC2323617
- Eprint ID: 74797
- Resolver ID: CaltechAUTHORS:20170306-133352205
- NSF: DMS-0603448
- NSF: CCF-0347992
- Bill and Melinda Gates Foundation
- Created: 2017-03-06 (created from EPrint's datestamp field)
- Updated: 2021-11-11 (created from EPrint's last_modified field)