Published May 20, 2022 | Supplemental Material + Submitted
Report Open

Efficient querying of genomic reference databases with gget

Abstract

A recurring challenge in interpreting genomic data is the assessment of results in the context of existing reference databases. Currently, there is no tool implementing automated, easy programmatic access to curated reference information stored in a diverse collection of large, public genomic databases. gget is a free and open-source command-line tool and Python package that enables efficient querying of genomic reference databases, such as Ensembl. gget consists of a collection of separate but interoperable modules, each designed to facilitate one type of database querying required for genomic data analysis in a single line of code. The manual and source code are available at https://github.com/pachterlab/gget.
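As a rough illustration of the "single line of code" idea, the sketch below wraps two gget modules in small Python helpers. It assumes the `gget` package is installed and that network access to Ensembl is available. The helper names `find_genes` and `describe_genes` are hypothetical and not part of gget. The call shapes `gget.search(searchwords, species)` and `gget.info(ens_ids)` follow the gget manual at the time of writing and may differ between releases, so treat this as a sketch rather than a reference.

```python
def find_genes(keywords, species="homo_sapiens"):
    """Search Ensembl for genes whose descriptions match the given keywords.

    Hypothetical helper: it only forwards its arguments to gget.search.
    """
    import gget  # third-party package; imported lazily so the helper can be defined without it

    # One gget call per query: keywords plus a species name.
    return gget.search(keywords, species)


def describe_genes(ensembl_ids):
    """Fetch reference metadata for a list of Ensembl IDs.

    Hypothetical helper: it only forwards its argument to gget.info.
    """
    import gget  # same lazy import as above

    return gget.info(ensembl_ids)


if __name__ == "__main__":
    # Requires gget and a network connection; results depend on the current Ensembl release.
    hits = find_genes(["gaba"], "homo_sapiens")
    print(hits.head())
```

Keeping each helper to a single gget call mirrors the modular design described in the abstract: the output of one module (for example, Ensembl IDs found by a search) can be passed to another module as input.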

Additional Information

The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Version 1 - May 19, 2022; Version 2 - May 25, 2022; Version 3 - May 27, 2022.

We thank Kyung Hoi (Joseph) Min for advice on the command-line interface, Matteo Guareschi for advice on Windows operability, and A. Sina Booeshaghi, Kristján Eldjárn Hjörleifsson, and Ángel Gálvez-Merchán for insightful discussions about gget. Illustrations in Figure 1 and Supplementary Figure 1 were created with BioRender.com. Thanks to the wonderful staff at Dash Coffee Bar in Pasadena, who occasionally gave LL free banana bread to sustain this work.

LL was supported by funding from the Biology and Bioengineering Division at the California Institute of Technology and the Chen Graduate Innovator Grant CHEN.SYS3.CGIAFY21. LP was supported in part by NIH U19MH114830.

Conflict of Interest: none declared.

Attached Files

Submitted - 2022.05.17.492392v3.full.pdf

Supplemental Material - media-1.pdf

Files (2.9 MB)

md5:4a4b95a50d282a75ec6be37453fd11bc (1.7 MB)
md5:8f8d7d102903cda1bf1e0bebc78f0a7c (1.2 MB)

Additional details

Created: August 20, 2023
Modified: December 22, 2023