Published May 2008 | Supplemental Material + Submitted + Published
Journal Article Open

Viral Population Estimation Using Pyrosequencing

Abstract

The diversity of virus populations within single infected hosts presents a major difficulty for the natural immune response as well as for vaccine design and antiviral drug therapy. Recently developed pyrophosphate-based sequencing technologies (pyrosequencing) can be used for quantifying this diversity by ultra-deep sequencing of virus samples. We present computational methods for the analysis of such sequence data and apply these techniques to pyrosequencing data obtained from HIV populations within patients harboring drug-resistant virus strains. Our main result is the estimation of the population structure of the sample from the pyrosequencing reads. This inference is based on a statistical approach to error correction, followed by a combinatorial algorithm for constructing a minimal set of haplotypes that explain the data. Using this set of explaining haplotypes, we apply a statistical model to infer the frequencies of the haplotypes in the population via an expectation–maximization (EM) algorithm. We demonstrate that pyrosequencing reads allow for effective population reconstruction by extensive simulations and by comparison to 165 sequences obtained directly from clonal sequencing of four independent, diverse HIV populations. Thus, pyrosequencing can be used for cost-effective estimation of the structure of virus populations, promising new insights into viral evolutionary dynamics and disease control strategies.
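The frequency-estimation step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes reads are already error-corrected, that each read is a hypothetical `(start, sequence)` pair, and that a read either matches a candidate haplotype exactly or not at all (the paper's statistical model is not reproduced here). The function names `compatible` and `em_haplotype_frequencies` are illustrative only.

```python
def compatible(read, haplotype):
    """Return True if the read matches the haplotype exactly at its start offset."""
    start, seq = read
    return haplotype[start:start + len(seq)] == seq


def em_haplotype_frequencies(reads, haplotypes, iterations=200, tol=1e-10):
    """Estimate haplotype frequencies with a basic EM loop.

    Assumption (not from the source): reads are error-free, so the
    likelihood of a read given a haplotype is 1 if compatible, else 0.
    """
    k = len(haplotypes)
    compat = [[1.0 if compatible(r, h) else 0.0 for h in haplotypes]
              for r in reads]
    if any(sum(row) == 0 for row in compat):
        raise ValueError("every read must match at least one haplotype")

    freqs = [1.0 / k] * k  # start from the uniform distribution
    for _ in range(iterations):
        # E-step: split each read among the haplotypes it is compatible
        # with, in proportion to the current frequency estimates.
        counts = [0.0] * k
        for row in compat:
            weights = [f * c for f, c in zip(freqs, row)]
            total = sum(weights)
            for j, w in enumerate(weights):
                counts[j] += w / total
        # M-step: new frequencies are the expected read shares.
        new_freqs = [c / len(reads) for c in counts]
        converged = max(abs(a - b) for a, b in zip(freqs, new_freqs)) < tol
        freqs = new_freqs
        if converged:
            break
    return freqs


if __name__ == "__main__":
    haplotypes = ["AAAA", "AATT"]
    reads = [(0, "AA"), (0, "AA"),              # ambiguous: match both
             (2, "AA"), (2, "AA"), (2, "AA"),   # match only AAAA
             (2, "TT")]                         # matches only AATT
    print(em_haplotype_frequencies(reads, haplotypes))
```

In this toy input the unambiguous reads split 3 to 1, and the ambiguous reads are apportioned by the current estimate, so the iteration settles near frequencies of 0.75 and 0.25.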

Additional Information

© 2008 Eriksson et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Received: July 2, 2007; Accepted: March 27, 2008; Published: May 9, 2008.

N. Eriksson and L. Pachter were partially supported by the NSF (grants DMS-0603448 and CCF-0347992, respectively). N. Beerenwinkel was funded by a grant from the Bill and Melinda Gates Foundation through the Grand Challenges in Global Health Initiative. The NSF has played no role in any part of this work.

The authors have declared that no competing interests exist.

Author Contributions. Performed the experiments: YM SR CW BG MR RS. Analyzed the data: NE LP NB. Wrote the paper: NE LP NB.

Attached Files

Published - journal.pcbi.1000074_1_.PDF

Submitted - 0707.0114.pdf

Supplemental Material - Figure_S3_1_.pdf

Supplemental Material - Figure_S4.pdf

Supplemental Material - Figure_Supp2.pdf

Supplemental Material - upp.1.pdf

Files

Files (2.7 MB)
md5:68213148e40be649ef1317f9e595fb47  9.3 kB
md5:61905ed4cce36b090ff4455664f33c2c  10.0 kB
md5:d96d2c0e6a80e81886c048551cdf277d  442.6 kB
md5:6192f13b10c61279e8fcd75b0a30d162  24.5 kB
md5:ef336f214472e0f136d529a563f7efc8  2.2 MB
md5:25524a372b58313f57f7f4c73200ae3f  12.1 kB

Additional details

Created: August 19, 2023
Modified: October 24, 2023