Published April 6, 2018 | Published
Journal Article Open

Research: A comprehensive and quantitative exploration of thousands of viral genomes

Abstract

The complete assembly of viral genomes from metagenomic datasets (short genomic sequences gathered from environmental samples) has proven to be challenging, so there are significant blind spots when we view viral genomes through the lens of metagenomics. One approach to overcoming this problem is to leverage the thousands of complete viral genomes that are publicly available. Here we describe our efforts to assemble a comprehensive resource that provides a quantitative snapshot of viral genomic trends – such as gene density, noncoding percentage, and abundances of functional gene categories – across thousands of viral genomes. We have also developed a coarse-grained method for visualizing viral genome organization for hundreds of genomes at once, and have explored the extent of the overlap between bacterial and bacteriophage gene pools. Existing viral classification systems were developed prior to the sequencing era, so we present our analysis in a way that allows us to assess the utility of the different classification systems for capturing genomic trends.

Additional Information

© 2018 eLife Sciences Publications Ltd. Subject to a Creative Commons Attribution license, except where otherwise noted. Received 17 September 2017; Accepted 30 March 2018; Published 19 April 2018.

We thank Arup Chakraborty, Markus Covert and Richard Neher for their helpful suggestions through the review process. We would additionally like to thank Bill Gelbart, Eddy Rubin, Forest Rohwer, Eugene Shakhnovich, Matt Morgan, and members of the Phillips Lab and the Boundaries of Life Initiative for helpful discussions. We would especially like to thank Helen Foley for helping us run BLAST using Amazon cloud computing services.

This study was supported by the National Science Foundation (Graduate Research Fellowship; DGE-1144469), the John Templeton Foundation (Boundaries of Life Initiative; 51250), the National Institutes of Health (Maximizing Investigators' Research Award; RFA-GM-17-002), the National Institutes of Health (Exceptional Unconventional Research Enabling Knowledge Acceleration; R01-GM098465), and the National Science Foundation (NSF PHY11-25915) through the 2015 Cellular Evolution course at the Kavli Institute for Theoretical Physics.

Author contributions: Gita Mahmoudabadi, Conceptualization, Data curation, Formal analysis, Supervision, Validation, Visualization, Methodology, Writing – original draft, Project administration, Writing – review and editing; Rob Phillips, Conceptualization, Supervision, Funding acquisition, Investigation, Methodology, Project administration, Writing – review and editing.

The authors declare that no competing interests exist.

Attached Files

Published - elife-31955-v2.pdf

File size: 6.4 MB
md5:252a79196a80f51e9ec67eb589202789

Additional details

Created: August 19, 2023
Modified: October 18, 2023