Published February 14, 2022 | Submitted
Report Open

RNA velocity unraveled

Abstract

We perform a thorough analysis of RNA velocity methods, with a view towards understanding the suitability of the various assumptions underlying popular implementations. In addition to providing a self-contained exposition of the underlying mathematics, we undertake simulations and perform controlled experiments on biological datasets to assess workflow sensitivity to parameter choices and underlying biology. Finally, we argue for a more rigorous approach to RNA velocity, and present a framework for Markovian analysis that points to directions for improvement and mitigation of current problems.

Additional Information

The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license. This version posted February 13, 2022.

G.G., M.F., T.C., and L.P. were partially funded by NIH U19MH114830. The DNA and RNA illustrations used in Figure 3 are derived from the DNA Twemoji by Twitter, Inc., used under CC-BY 4.0. The palette used in Figures 6, 8, 10, S3, and S5 is derived from dutchmasters by EdwinTh. G.G. thanks Dr. John J. Vastola for fruitful discussions about landscape representations of biophysical systems.

Data Availability: The datasets analyzed for Section 3.1, as outlined in Section 6.1, are listed in Table 1. The datasets released by Desai et al. were collated from the Sequence Read Archive (runs SRR14713295 for dmso and SRR14713295 for idu) [155]. The datasets released by 10x Genomics were obtained from https://support.10xgenomics.com/single-cell-gene-expression/datasets. The processed human forebrain dataset generated by La Manno et al. [1] was obtained from http://pklab.med.harvard.edu/velocyto/hgForebrainGlut/hgForebrainGlut.loom, as used in the velocyto documentation. The processed loom files generated by the three workflows are available at the CaltechData repository, at https://data.caltech.edu/records/20030. All Python scripts and notebooks necessary to reproduce the results of this study are available at https://github.com/pachterlab/GFCP_2022.

Author Contributions: Conceived of the project: G.G. and L.P. Wrote scripts/notebooks for pre-processing, simulation, and analysis: G.G., M.F., and T.C. Analyzed and interpreted the data: G.G., M.F., T.C., and L.P. Wrote and edited the manuscript: G.G., M.F., T.C., and L.P.

The authors have declared no competing interest.

Attached Files

Submitted - 2022.02.12.480214v1.full.pdf (14.6 MB, md5:7d293db6252608681b8d119bbed90244)

Additional details

Created: August 20, 2023
Modified: December 13, 2023