Published January 14, 2023
Supplemental Material + Submitted
Discussion Paper
Open
Mechanistic modeling with a variational autoencoder for multimodal single-cell RNA sequencing data
Abstract
We motivate and present biVI, which combines the variational autoencoder framework of scVI with biophysically motivated, bivariate models for nascent and mature RNA distributions. In simulated benchmarking, biVI accurately recapitulates key properties of interest, including cell type structure, parameter values, and copy number distributions. In biological datasets, biVI provides a route for the identification of the biophysical mechanisms underlying differential expression. The analytical approach outlines a generalizable strategy for representing multimodal datasets generated by single-cell RNA sequencing.
Additional Information
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

M.C., G.G., T.C., and L.P. were partially funded by IGVF-1-UCI.IGVF and NIH U19MH114830. Y.C. was partially funded by T32 GM007377. G.G. thanks Drs. Ido Golding and Heng Xu for the inspiration leading to the explanatory model for the zero-inflated negative binomial distribution in Section S1.4. The RNA illustrations used in Figures 1, 2, S1, and S2 were derived from the DNA Twemoji by Twitter, Inc., used under the CC-BY 4.0 license. We thank the Caltech Bioinformatics Resource Center for GPU resources that helped in performing the analyses.

Data availability. Simulated datasets, simulated parameters used to generate them, and Allen dataset B08 and its associated metadata are available in the Zenodo package 7497222. All analysis scripts and notebooks are available at https://github.com/pachterlab/CGCCP_2023. The repository also contains a Google Colaboratory demonstration notebook applying the methods to a small human blood cell dataset.

The authors have declared no competing interest.

Attached Files
Submitted - 2023.01.13.523995v1.full.pdf
Supplemental Material - media-1.pdf
Files (12.3 MB)
Checksum | Size |
---|---|
md5:29f55873752dcb65b857a22e4b5c8c9a | 5.0 MB |
md5:45b7525ab6332258610c9d1a8397d051 | 7.3 MB |
Additional details
- Eprint ID: 120153
- Resolver ID: CaltechAUTHORS:20230316-182533000.39
- NIH: U19MH114830
- Impact of Genomic Variation on Function (IGVF) Consortium
- NIH Predoctoral Fellowship: T32 GM007377
- Created: 2023-03-18 (from EPrint's datestamp field)
- Updated: 2023-03-18 (from EPrint's last_modified field)
- Caltech groups: Division of Biology and Biological Engineering (BBE)