Ab initio reconstruction of cell type–specific transcriptomes in mouse reveals the conserved multi-exonic structure of lincRNAs
Abstract
Massively parallel cDNA sequencing (RNA-Seq) provides an unbiased way to study a transcriptome, including both coding and noncoding genes. Until now, most RNA-Seq studies have depended crucially on existing annotations and thus focused on expression levels and variation in known transcripts. Here, we present Scripture, a method to reconstruct the transcriptome of a mammalian cell using only RNA-Seq reads and the genome sequence. We applied it to mouse embryonic stem cells, neuronal precursor cells and lung fibroblasts to accurately reconstruct the full-length gene structures for most known expressed genes. We identified substantial variation in protein coding genes, including thousands of novel 5′ start sites, 3′ ends and internal coding exons. We then determined the gene structures of more than a thousand large intergenic noncoding RNA (lincRNA) and antisense loci. Our results open the way to direct experimental manipulation of thousands of noncoding RNAs and demonstrate the power of ab initio reconstruction to render a comprehensive picture of mammalian transcriptomes.
Additional Information
© 2010 Macmillan Publishers Limited. Received 10 March; accepted 6 April; published online 2 May 2010; corrected after print 9 July 2010.

We thank M. Wernig (MIT) for providing NPC; M. Lin and M. Kellis (MIT) for CSF code; the Broad Sequencing Platform for sample sequencing; L. Gaffney for assistance with graphics; and C. Burge, J. Merkin, R. Bradley and members of Lander and Regev laboratories (in particular, M. Yassour, T. Mikkelsen and I. Amit) for discussions.

A.R. and J.L.R. were supported by the Merkin Family Foundation for Stem Cell Research at the Broad Institute. M. Guttman was supported by a Vertex scholarship. Work was supported by a Burroughs Wellcome Fund Career Award at the Scientific Interface, a US National Institutes of Health PIONEER award, a US National Human Genome Research Institute (NHGRI) R01 grant and the Howard Hughes Medical Institute (A.R.), and NHGRI and the Broad Institute of MIT and Harvard (E.S.L.).

Author Contributions: M. Guttman and M. Garber conceived the project, designed research, implemented Scripture, performed computational analysis and wrote the paper. A.G., C.N. and J.Z.L. oversaw cDNA sequencing, provided molecular biology advice and helped to edit the manuscript. J.D. constructed cDNA libraries, performed validation experiments and helped to edit the manuscript. J.R. implemented components of Scripture and provided computational support and technical advice. X.A., L.F. and M.J.K. constructed cDNA libraries. J.L.R. provided reagents and helped edit the manuscript. E.S.L. designed research direction and wrote the paper. A.R. provided cDNA sequencing guidance, conceived the project, designed research direction and wrote the paper.

The authors declare no competing financial interests.
Attached Files
Accepted Version - nihms194494.pdf
Supplemental Material - nbt.1633-S1.pdf
Supplemental Material - nbt.1633-S2.xls
Supplemental Material - nbt.1633-S3.xls
Supplemental Material - nbt.1633-S4.zip
Supplemental Material - nbt.1633-S5.zip
Supplemental Material - nbt.1633-S6.zip
Supplemental Material - nbt.1633-S7.zip
Erratum - nbt0710-756b.pdf
Files
MD5 checksum | Size |
---|---|
0ce6a9fe8ef9c29d57e9bc74bd2e469c | 3.2 MB |
143a48395f3a030de45b18e7313a9e95 | 15.8 MB |
3c9f062a208ef861b247dbec76604612 | 15.2 MB |
467aff5e994371e9eb2648278e5904b5 | 63.4 kB |
39f8410f2855536e9fbabfc35472641e | 13.8 kB |
0e2c60353c7dcd0694bdfbcaf7451941 | 43.4 MB |
e5a9f162413f82e7c5607b2dec42f2d8 | 38.6 MB |
ca1d6a8abd1d80e3ed10eed6b414514a | 10.2 kB |
b603e4b25d0fe726f18c284406041540 | 1.3 MB |
Additional details
- PMCID: PMC2868100
- Eprint ID: 72232
- Resolver ID: CaltechAUTHORS:20161122-073633977
- Funders: Broad Institute of MIT and Harvard; Vertex Scholarship; Burroughs Wellcome Fund; NIH; National Human Genome Research Institute; Howard Hughes Medical Institute (HHMI)
- Created: 2016-11-22 (created from EPrint's datestamp field)
- Updated: 2023-06-01 (created from EPrint's last_modified field)