Published July 2020 | Submitted + Published + Supplemental Material
Journal Article Open

Dissecting the regulatory activity and sequence content of loci with exceptional numbers of transcription factor associations

Abstract

DNA-associated proteins (DAPs) classically regulate gene expression by binding to regulatory loci such as enhancers or promoters. As expanding catalogs of genome-wide DAP binding maps reveal thousands of loci that, unlike the majority of conventional enhancers and promoters, associate with dozens of different DAPs with apparently little regard for motif preference, an understanding of DAP association and coordination at such regulatory loci is essential to deciphering how these regions contribute to normal development and disease. In this study, we aggregated publicly available ChIP-seq data from 469 human DAPs assayed in three cell lines and integrated these data with an orthogonal data set of 352 nonredundant, in vitro–derived motifs mapped to the genome within DNase I hypersensitivity footprints to characterize regions with high numbers of DAP associations. We establish a generalizable definition for high occupancy target (HOT) loci and identify putative driver DAP motifs in HepG2 cells, including HNF4A, SP1, SP5, and ETV4, that are highly prevalent and show sequence conservation at HOT loci. The number of different DAPs associated with an element is positively associated with evidence of regulatory activity, and by systematically mutating 245 HOT loci with a massively parallel mutagenesis assay, we localized regulatory activity to a central core region that depends on the motif sequences of our previously nominated driver DAPs. In sum, this work leverages the increasingly large number of DAP motif and ChIP-seq data publicly available to explore how DAP associations contribute to genome-wide transcriptional regulation.
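The idea of counting how many different DAPs associate with a locus can be illustrated with a small sketch. This is not the authors' pipeline or their HOT definition: the fixed-width binning, the `bin_size` value, the `min_daps` threshold, and the toy peak coordinates are all assumptions made purely for illustration.

```python
from collections import defaultdict


def count_daps_per_bin(peaks, bin_size=2000):
    """Map each (chrom, bin index) to the set of DAPs whose peak midpoint falls in it.

    `peaks` is an iterable of (chrom, start, end, dap_name) tuples.
    Fixed-width binning is an illustrative simplification, not the paper's method.
    """
    daps_by_bin = defaultdict(set)
    for chrom, start, end, dap in peaks:
        midpoint = (start + end) // 2
        daps_by_bin[(chrom, midpoint // bin_size)].add(dap)
    return daps_by_bin


def high_occupancy_bins(daps_by_bin, min_daps):
    """Return the bins associated with at least `min_daps` distinct DAPs."""
    return sorted(key for key, daps in daps_by_bin.items() if len(daps) >= min_daps)


# Toy peaks with made-up coordinates; only the DAP names come from the abstract.
peaks = [
    ("chr1", 1000, 1200, "HNF4A"),
    ("chr1", 1100, 1300, "SP1"),
    ("chr1", 1150, 1250, "ETV4"),
    ("chr1", 1050, 1350, "HNF4A"),  # repeated DAP, counted once per bin
    ("chr1", 9000, 9200, "SP1"),
]

daps_by_bin = count_daps_per_bin(peaks)
hot_like = high_occupancy_bins(daps_by_bin, min_daps=3)
```

Using a set per bin means each DAP contributes at most once to a locus, so the count reflects the number of different DAPs rather than the number of peaks.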

Additional Information

© 2020 Ramaker et al.; Published by Cold Spring Harbor Laboratory Press. This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.

Received December 26, 2019; accepted in revised form June 24, 2020.

We thank the Yijun Ruan and Struan Grant laboratories for uniform processing of the ENCODE ChIA-PET and Promoter Capture-C data, respectively. We also thank Eric Mendenhall and Surya Chhetri for their assistance with the alignment and quality control analysis of ChIP-seq experiments in HepG2, and particularly thank them and the Myers/Mendenhall ENCODE group members, including Mark Mackiewicz, Kim Newberry, Dianna Moore, Laurel Brandsmeier, Sarah Meadows, and Megan McEown, for generating the high-quality ChIP-seq data used in this paper. We thank Alessandra Chesi and the Struan F.A. Grant lab for generously providing their processed HepG2 Capture-C data.

This work was supported by National Institutes of Health (NIH) grants U54 HG006998-0 (to R.M.M. and E. Mendenhall) and 5T32GM008361-21 (to R.C.R. and A.A.H.).

Data Access: All raw and processed sequencing data generated in this study have been submitted to the NCBI Gene Expression Omnibus (GEO; https://www.ncbi.nlm.nih.gov/geo/) under the accession number GSE142566.

Author contributions: R.C.R., A.A.H., and E.C.P. conducted reporter assay experiments; R.C.R., A.A.H., and S.T.G. performed computational analysis of ChIP-seq, DFM, and 3D-chromatin interaction data; and R.C.R., A.A.H., E.C.P., S.T.G., S.J.C., B.W., and R.M.M. performed data interpretation and wrote the manuscript.

The authors declare no competing interests.

Attached Files

Published - 939.full.pdf

Submitted - 2019.12.21.885830v2.full.pdf

Supplemental Material - SupplementalTables_061120.xlsx

Supplemental Material - Supplemental_Materials_revised.pdf

Supplemental Material - Supplemental_Scripts.zip

Files (139.8 MB)

md5:0be2f4014e45dff07558d31dd9a788a8 (1.5 MB)
md5:29b228a2a56b0ce2266da513d9e4cb40 (4.4 MB)
md5:70d02943f1697f06abb2bbf349e711ba (5.5 MB)
md5:fb61f581b98a6521bfe059fd7a42da44 (99.5 MB)
md5:24cab03f5fcb38fc514f7f2051ceb7ce (28.9 MB)
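
A downloaded copy of any attached file can be checked against the MD5 checksums listed above. The following is a minimal sketch using Python's standard `hashlib` module. The record does not state which checksum belongs to which file name, so the path and pairing in the usage comment are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file's MD5 with a listed value such as 'md5:0be2f4...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Hypothetical usage (the local path and the file-to-checksum pairing are assumptions):
# matches_listed_checksum("939.full.pdf", "md5:0be2f4014e45dff07558d31dd9a788a8")
```

A mismatch most often indicates an incomplete or corrupted download, or that the checksum belongs to a different file in the list.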

Additional details

Created: August 19, 2023
Modified: December 22, 2023