Published July 15, 2017 | Submitted
Journal Article Open

Pseudoalignment for metagenomic read assignment

Abstract

Motivation: Read assignment is an important first step in many metagenomic analysis workflows, providing the basis for identification and quantification of species. However, ambiguity among the sequences of many strains makes it difficult to assign reads at the lowest level of taxonomy, and reads are typically assigned to taxonomic levels where they are unambiguous. We explore connections between metagenomic read assignment and the quantification of transcripts from RNA-Seq data in order to develop novel methods for rapid and accurate quantification of metagenomic strains.

Results: We find that the recent idea of pseudoalignment introduced in the RNA-Seq context is highly applicable in the metagenomics setting. When pseudoalignment is coupled with the Expectation-Maximization (EM) algorithm, reads can be assigned far more accurately and quickly than is currently possible with state-of-the-art software, making it possible and practical for the first time to analyze abundances of individual genomes in metagenomics projects.
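To make the combination of pseudoalignment and EM concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' software: the pseudoalignment step is taken as given, so each read arrives as a set of genome indices it is compatible with, and genome lengths are ignored for simplicity. The function name `em_abundances` and its parameters are hypothetical.

```python
from collections import Counter


def em_abundances(compat_sets, n_targets, n_iter=200, tol=1e-10):
    """Estimate the fraction of reads from each target genome.

    compat_sets: iterable of sets of target indices, one per read
                 (assumed output of a pseudoalignment step).
    n_targets:   number of candidate genomes.
    Assumption: every genome has the same effective length.
    """
    # Reads with identical compatibility sets form one equivalence class,
    # so the EM loop runs over classes rather than individual reads.
    classes = Counter(frozenset(s) for s in compat_sets if s)
    total = sum(classes.values())
    if total == 0:
        raise ValueError("no reads with a non-empty compatibility set")

    # Start from a uniform abundance estimate.
    theta = [1.0 / n_targets] * n_targets

    for _ in range(n_iter):
        expected = [0.0] * n_targets
        # E-step: split each class's read count among its targets in
        # proportion to the current abundance estimates.
        for targets, count in classes.items():
            denom = sum(theta[t] for t in targets)
            for t in targets:
                expected[t] += count * theta[t] / denom
        # M-step: renormalize expected counts into abundances.
        new_theta = [e / total for e in expected]
        delta = max(abs(a - b) for a, b in zip(theta, new_theta))
        theta = new_theta
        if delta < tol:
            break
    return theta


# Toy input: three reads unique to genome 0, one unique to genome 1,
# and two reads compatible with both.
reads = [{0}, {0}, {0}, {0, 1}, {0, 1}, {1}]
theta = em_abundances(reads, n_targets=2)
```

In this toy input the ambiguous reads are shared out according to the evidence from the unique reads, which is how an EM step can resolve strain-level ambiguity that would otherwise push reads up to a higher taxonomic level.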

Additional Information

© The Author 2017. Published by Oxford University Press. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices).

Received on October 18, 2016; revised on January 23, 2017; editorial decision on February 15, 2017; accepted on February 17, 2017. Published: 21 February 2017.

We thank readers of preprints of this manuscript for helpful suggestions that have improved our method and its description in the paper. H.P. was supported by an NSF graduate research fellowship. P.M. was partially supported by a Fulbright fellowship. L.S. and L.P. were partially supported by NIH R01 HG006129 and NIH R01 DK094699.

Conflict of Interest: none declared.

Attached Files

Submitted - 1510.07371.pdf

Files

1510.07371.pdf (1.6 MB)
md5:afa537b3beb55a7ae0ec1b01424475ae

Additional details

Created: August 19, 2023
Modified: October 24, 2023