Integrative genome modeling platform reveals essentiality of rare contact events in 3D genome organizations
Abstract
A multitude of sequencing-based and microscopy technologies provide the means to unravel the relationship between the three-dimensional organization of genomes and key regulatory processes of genome function. Here, we develop a multimodal data integration approach to produce populations of single-cell genome structures that are highly predictive for nuclear locations of genes and nuclear bodies, local chromatin compaction and spatial segregation of functionally related chromatin. We demonstrate that multimodal data integration can compensate for systematic errors in some of the data and can greatly increase accuracy and coverage of genome structure models. We also show that alternative combinations of different orthogonal data sources can converge to models with similar predictive power. Moreover, our study reveals the key contributions of low-frequency ('rare') interchromosomal contacts to accurately predicting the global nuclear architecture, including the positioning of genes and chromosomes. Overall, our results highlight the benefits of multimodal data integration for genome structure analysis, available through the Integrative Genome Modeling software package.
Additional Information
© The Author(s) 2022. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Received 22 August 2021; Accepted 18 May 2022; Published 11 July 2022.

This work was supported by the National Institutes of Health (NIH; grants U54DK107981 and UM1HG011593 to F.A.), and an NSF CAREER grant (1150287 to F.A.). We thank the laboratories of J. Dekker (University of Massachusetts Medical School), B. Van Steensel (Netherlands Cancer Institute), T. Misteli (NIH) and A. Belmont (University of Illinois Urbana-Champaign) for kindly providing the experimental data (in situ Hi-C, lamina DamID, 3D HIPMap FISH, DNA SPRITE and SON TSA-seq) used for generating and validating our genome models. We thank W. Li for proofreading the section about the probability functions.
Data availability: The following datasets were used to generate or validate the structures: ensemble Hi-C (4DN portal; accession code 4DNES2R6PUEK), lamin B1 DamID (4DN portal; accession code 4DNESXZ4FW4T), 3D HIPMap FISH (4DN portal; https://data.4dnucleome.org/publications/80007b23-7748-4492-9e49-c38400acbe60), single-cell SPRITE (4DN portal identifier: 4DNESJYGTI8S, private), SON TSA-seq (4DN portal; 4DNES85R9TIB), transcription data (ENCODE; accession code ENCSR735JKB). Super-resolution single-cell imaging data are available in the referenced papers. The pre-processed experimental inputs of different data sources (Hi-C, lamin B1 DamID, 3D HIPMap FISH and single-cell SPRITE) for the HFF cell line and the simulated HDSF population are available at https://doi.org/10.5281/zenodo.6540731. Other data (including configuration files and synthetic data input files) are available upon request. The configuration files and pre-processed data input files are sufficient to reproduce the structure populations with the IGM software.

Code availability: The IGM platform is available at www.github.com/alberlab/igm/. This includes, but is not limited to, the source code, a README file detailing code installation and execution, accompanying documentation, and a demo that uses a reduced data input so that users can familiarize themselves with the input, expected outputs and execution steps.

Contributions: L.B. and F.A. designed research. L.B., A.Y. and Y.Z. performed all calculations and data analysis. L.B., A.Y. and F.A. interpreted results and data analysis with input from X.J.Z. G.P., L.B. and A.Y. wrote software and documentation. S.A.Q. and M.G. contributed new data sources. E.H.F. provided data and help in data interpretation. L.B., A.Y. and F.A. wrote the manuscript with input from X.J.Z. All authors approved the final manuscript.

The authors declare no competing interests.

Peer review information: Nature Methods thanks Ming Hu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Lin Tang, in collaboration with the Nature Methods team.
Attached Files
Published - s41592-022-01527-x.pdf
Submitted - 2021.08.22.457288v1.full.pdf
Supplemental Material - 41592_2022_1527_Fig10_ESM.webp
Supplemental Material - 41592_2022_1527_Fig11_ESM.webp
Supplemental Material - 41592_2022_1527_Fig12_ESM.webp
Supplemental Material - 41592_2022_1527_Fig13_ESM.webp
Supplemental Material - 41592_2022_1527_Fig14_ESM.webp
Supplemental Material - 41592_2022_1527_Fig7_ESM.webp
Supplemental Material - 41592_2022_1527_Fig8_ESM.webp
Supplemental Material - 41592_2022_1527_Fig9_ESM.webp
Supplemental Material - 41592_2022_1527_MOESM1_ESM.pdf
Supplemental Material - 41592_2022_1527_MOESM2_ESM.pdf
Files
Checksum | Size |
---|---|
md5:0a5a673acb7d82096beb7a4eb678f174 | 123.7 kB |
md5:1e6f3018cd937a8ca0a07cfbccb3b9a9 | 216.6 kB |
md5:b772bda88e511939ead310ba0f1d4d3a | 5.2 MB |
md5:dd03117ad0ea93d16cf1d13114c3b76c | 80.0 kB |
md5:2f516b076b176c71f6ef9c8d0c476a21 | 161.4 kB |
md5:d607d6ef9718703e3ec7dab2eea6da43 | 783.1 kB |
md5:54268e8741e0f798b644df34c3583646 | 1.1 MB |
md5:503198b1752e16368c1c878a468c4118 | 110.9 kB |
md5:5e1139c19d1897e3c5230d69ec22d278 | 294.9 kB |
md5:14cba596d3928e361129fdf3e2866d49 | 17.2 MB |
md5:420d44972109f1bbe526df3ba6693d7b | 687.9 kB |
md5:64720fa4549a407855023e6982e19e48 | 299.2 kB |
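The md5 values listed above can be used to check that a downloaded file arrived intact. The following is a minimal sketch of one way to do that with Python's standard library; the helper names `md5_of_file` and `matches_record` and the example file path are hypothetical and not part of this record, and the listing does not show which checksum belongs to which attached file, so that pairing has to be taken from the download page itself.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or as plain hex."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_record("downloaded_file.pdf", "md5:b772bda88e511939ead310ba0f1d4d3a")` would return `True` only if the local file (a placeholder path here) has exactly that digest. MD5 is suitable for detecting accidental corruption during transfer, not for defending against deliberate tampering.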
Additional details
- Eprint ID: 110401
- Resolver ID: CaltechAUTHORS:20210824-174746931
- NIH: U54DK107981
- NIH: 1UM1HG011593
- NSF: DBI-1150287
- Created: 2021-08-24 (created from EPrint's datestamp field)
- Updated: 2022-08-04 (created from EPrint's last_modified field)
- Caltech groups: Division of Biology and Biological Engineering (BBE)