Published January 2023
Journal Article, Open Access

Metadata retrieval from sequence databases with ffq

Abstract

Motivation: Several genomic databases host data and metadata for an ever-growing collection of sequence datasets. While these databases have a shared hierarchical structure, there are no tools specifically designed to leverage it for metadata extraction.

Results: We present a command-line tool, called ffq, for querying user-generated data and metadata from sequence databases. Given an accession or a paper's DOI, ffq efficiently fetches metadata and links to raw data in JSON format. ffq's modularity and simplicity make it extensible to any genomic database exposing its data for programmatic access.

Availability and implementation: ffq is free and open source, and the code can be found here: https://github.com/pachterlab/ffq.
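
As a rough illustration of what working with hierarchical JSON metadata of this kind can look like, the Python sketch below walks a nested record and collects every file link it contains. The record layout, the key names (`samples`, `runs`, `files`, `url`), the accessions, and the `example.org` URLs are all invented for illustration. They are not ffq's actual output schema, which should be checked against the tool's own documentation.

```python
from typing import Any, Iterator


def iter_urls(node: Any) -> Iterator[str]:
    """Yield every string stored under a 'url' key anywhere in a nested record."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "url" and isinstance(value, str):
                yield value
            else:
                # Descend into nested dictionaries and lists.
                yield from iter_urls(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_urls(item)


# Hypothetical study -> sample -> run hierarchy (assumed layout, not ffq's schema).
record = {
    "accession": "STUDY0001",
    "samples": {
        "SAMPLE0001": {
            "runs": {
                "RUN0001": {
                    "files": [
                        {"url": "ftp://example.org/RUN0001_1.fastq.gz"},
                        {"url": "ftp://example.org/RUN0001_2.fastq.gz"},
                    ]
                }
            }
        }
    },
}

urls = list(iter_urls(record))
for url in urls:
    print(url)
```

A recursive walk is used here so that the same function works regardless of how deep in the hierarchy the links sit; a real script would more likely index directly into the documented fields.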

Additional Information

© The Author(s) 2023. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

This work was motivated by the need to obtain metadata for Booeshaghi and Pachter (2020). We thank Ali Mortazavi for his suggestion to include ffq querying of the ENCODE database and Anders Goncalves da Silva, Andrea Telatin, Laura Luebbert and Phil Ewels for their contributions to the code base.

Funding: This work was supported in part by National Institutes of Health (NIH) [U19MH114830].

Data availability: All data and code associated with this manuscript are available at https://github.com/pachterlab/ffq.

Conflict of Interest: none declared.

Attached Files

Published - btac667.pdf

Files

btac667.pdf (1.6 MB)
md5:28d847f26f23e2cf0729d82f72d203cd

Additional details

Created: August 22, 2023
Modified: December 22, 2023