1. Introduction
Unraveling the role of the non-coding regulatory elements in gene regulation is important to gain mechanistic insight into key biological processes such as cell fate determination and disease pathogenesis [1]. Accordingly, tremendous effort has been deployed to annotate these regulatory elements in different tissues thanks to innovative sequencing techniques and computational tools [2,3,4]. These techniques include chromatin immunoprecipitation with sequencing (ChIP-seq), assay for transposase accessible chromatin with sequencing (ATAC-seq), and chromatin capture conformation with high-throughput sequencing (Hi-C) [5,6]. In addition, computational methods such as the activity-by-contact (ABC) algorithm [3] and other tools [7,8] have been developed to integrate these datasets and identify enhancer-gene (E-G) pairs required for development and disease [9,10,11].
Besides the direct identification of E-G pairs, other studies have demonstrated that enhancers and promoters interact dynamically within a network called cis-regulatory hubs (CRH), which allows for multiple regulatory connections [12,13,14,15]. These CRHs can be defined as 3D regulatory networks that constitute a complex organization of multi-loci connections of enhancers and promoters connected in 3D space within the nucleus. They highlight direct and indirect contact between genes and distal regulatory elements and can help predict target genes for disease-relevant non-coding single-nucleotide polymorphisms (SNPs) [14]. They are often bound by a cluster of transcription factors (TF) and have been implicated in gene co-regulation, lineage specification and disease development [13,14,16]. These hubs differ from other known 3D structures, such as genomic compartments, which constitute euchromatin or heterochromatin regions. They also differ from topologically associated domains (TADs) defined as units within which most enhancer–promoter interactions occur [17]. Although the identification of these CRHs is often performed solely through detailed analysis of 3D contacts from Hi-C data, [13,18,19,20,21], a recent study proposed building these CRHs using the ABC algorithm [14]. The ABC algorithm combines ATAC-seq, ChiP-seq and Hi-C to rank E-G pairs based on their regulatory impact [3,22]. Thus, in addition to identifying and ranking enhancers, the ABC algorithm can be used to generate a more functionally relevant CRH [14]. CRHs built using ABC were strongly enriched for disease-relevant genes and also helped explain disease heritability [14,23].
Given that the 3D genome architecture through HI-C is a key component of the ABC algorithm, which is used to construct the CRHs [14], it is of huge importance to understand the role of architectural proteins like CTCF in these CRHs. The CTCF protein, often called the “genome weaver,” plays a crucial function in the “folding” of the genome and brings enhancers into proximity to their distant target genes [17,24]. It thus plays key roles in regulating gene expression in various tissues by facilitating enhancer–promoter interactions and by preventing ectopic enhancer–promoter interactions [17]. We and others have shown the importance of CTCF in regulating cardiac chromatin architecture, gene expression and heart disease [25,26]. However, CTCF’s role in regulating CRHs has not been explored. Thus, this study aims to understand the role of CTCF in determining the composition and boundaries of ABC-derived CRH. In this study, using mouse cardiomyocytes as a model of terminally differentiated non-dividing cells, we employ the ABC model to identify top mouse cardiomyocyte gene-enhancers and then annotate the CRHs. We analyze the characteristics of the CRHs containing tissue-specific genes, showing that these genes are often found in multi-enhancer CRHs. We also show that in disease, there is a positive regulation of genes and enhancers within a CRH. Finally, we show that loss of CTCF leads to a merging of CRHs and the formation of new CRHs in the Ctcf-KO cells with resultant drastic change in the number and distribution of enhancers that cross the ABC threshold.
2. Results
2.1. Mouse Cardiomyocyte Enhancer Landscape Identified Through Activity-by-Contact Algorithm
We performed ATAC-seq, H3K27ac ChiP-seq and HiChIP to generate the ABC scores for the enhancer-gene pairs in the control adult mouse cardiomyocytes (CM). The generation of the ABC scores is a first step in building the ABC-derived CRHs and unraveling the importance of CTCF in this process. Using an ABC cut-off of 0.01 [14], we identified ~7000 out of 156,988 putative distal regulatory regions marked by H3K27ac peaks and/or ATAC peaks in the adult mouse CM that crossed the threshold. This represents about 5% of all putative regulatory regions predicted to have strong regulatory effects on their target genes and is similar to what was observed in another study [3]. With regard to the enhancer-gene (E-G) pairs, globally there were 34,496 E-G interactions with an average of 4.19 connections per gene and 5.29 connections per enhancer in these healthy adult mouse CMs. Figure 1A shows examples of the top enhancers for 2 CM disease-relevant genes, Mybpc3 and Myh6. The comprehensive list of ABC-linked enhancers and target genes can be found in Supplementary Table S1. A literature search confirmed that at least three of the identified enhancers in our study have been validated. These include the Nppa/Nppb super-enhancer located upstream of both genes, which has been shown to play a critical role in the stress gene response of Nppa/Nppb during pressure-overload-induced CM stress [25,27,28]. A separate enhancer for Myh7 identified from the ABC scoring has also been validated previously in the mouse heart [29]. Deletion of this enhancer region led to the downregulation of Myh7 in mouse hearts [29]. Taken together, these previous studies provide support for the validity of the ABC model.
For a better understanding of some of the features of highly ranked enhancers, we focused our attention on the top 1% and bottom 1% of enhancers based on the ABC score. The enhancers in the top 1% had scores of 0.9–1, while those in the bottom 1% had scores of 0.01. First, we analyzed the average distance of these enhancers to the identified target genes. As expected, enhancers with high scores were more likely to be closer to the target genes, with mean and median distances of 83 kb and 54 kb, respectively, as compared to the mean and median distances of the bottom 1% of enhancers, which were 950 kb and 699 kb, respectively (p-value < 0.01). Next, we performed a gene ontology (GO) analysis of the genes linked to these enhancers. The GO biological processes found in the genes which were linked to the top 1% of enhancers included terms such as striated muscle cell differentiation, regulation of blood circulation and actin cytoskeletal organization (Supplementary Table S2). These are processes that are important for CM function and include genes such as Gata4, Mybpc2, Cox7a1 and Myl3. On the other hand, the GO for genes in the bottom 1% had generic terms like RNA processing, protein modification and cellular metabolic processes (Supplementary Table S2).
To provide further evidence that these ABC-linked enhancers regulate their putative target genes, we studied the correlation between changes in gene expression versus changes in peak height of the ABC-linked enhancers during pressure-overload-induced CM hypertrophy. For this analysis, we reanalyzed our previous work which detailed transcriptomic and epigenetic changes in a mouse model of heart failure through transverse aortic constriction [25]. Using RNA-seq and H3K27ac ChIP-seq datasets from that study, we found that changes in gene expression correlated modestly with changes in H3K27ac peak signals (Pearson correlation 0.34, p < 2.2 × 10−16) (Figure 1B). The correlation was stronger when focusing only on genes and enhancers with greater or less than a log2 0.5-fold change after transverse aortic constriction (Supplementary Figure S1A) (Pearson correlation = 0.51, p < 2.2 × 10−16). This result shows that globally, upon external stimuli, gene expression is regulated in the same direction as its ABC-linked enhancers.
2.2. Loss of CTCF Leads to Changes in the Enhancer Interactome and Changes in the ABC Scores of Putative Enhancers
Next, to elucidate the role of CTCF in ABC scores and the regulation of the CRHs, we also applied the ABC algorithm on the Ctcf-KO mouse CMs. By analyzing the ABC scores and CRHs in Ctcf-KO mouse CMs, we could begin to infer the importance of CTCF in the ABC-derived CRH. From our analysis, we observed that 36,226 regulatory elements crossed the 0.01 threshold. This resulted in 71,080 E-G interactions with a similar number of expressed genes (~8000) (Supplementary Table S1) in control and Ctcf-KO samples. This led to an increase in the average number of connections per gene to 8.14 in Ctcf-KO samples. Of the 71,080 enhancer-gene pairs, only 20% (14,200 E-G pairs) were shared between control and Ctcf-KO mouse CMs. This increase in the number of enhancers with an ABC score greater than 0.01 could be due to increased interactions between genes and ectopic enhancers upon CTCF loss. Interestingly, we observed that while the highest-ranked ABC enhancer in the control had a score of 0.99, the highest-ranked enhancer in the Ctcf-KO sample had a score of 0.78 (scale of 0–1). Indeed, the average score in the control was 0.054 while the average score in the Ctcf-KO was 0.028 (t-test p-value < 2.2 × 10−16). This suggests that the loss of CTCF led to a gain in ectopic enhancers in the Ctcf-KO, with a redistribution of the ABC scores of the enhancers.
2.3. Loss of CTCF Alters the Cis-Regulatory Hubs
We integrated the ABC results to generate the CRHs, focusing first on the control samples. We identified 1522 hubs in these control cardiomyocytes with about 70% of the hubs containing fewer than 5 elements (Figure 2A). Figure 2B shows 2 examples of such hubs containing CM genes: Myom1 is a gene involved in the formation of myofibrils [30] and is found in a simple hub. In contrast, Tnni3, a CM-specific sarcomeric gene, is found in a more complex hub. Consistent with the correlation between gene expression and ABC-linked enhancers in heart disease, we observed that genes and enhancers within the same hubs were positively correlated in the same direction of change during CM stress (Pearson correlation of 0.32 (p < 2.2 × 10−16)) (Figure 2C). This suggests a co-regulation of genes and enhancers found within the same hub upon external stimuli. We then ranked CM genes by FPKM and selected the top 50 genes, which were mostly CM-specific genes, to glean biological insights about the characteristics of CRHs that harbor these genes. Our data revealed that highly expressed CM genes were more likely to be in hubs with more regulatory elements than non-highly expressed genes (Figure 2D). Such multi-enhancer hubs have been proposed as a mechanism to buffer against external stresses and genetic perturbations and provide phenotypic robustness to disease-relevant genes [23,31].
Next, we analyzed the CRHs in the Ctcf-KO CMs and found a marked reduction in the total number of CRHs, from 1522 in the control to 660 CRHs in the KO, accompanied by an increase in the number of elements per hub. Figure 3A shows an example of the same hubs containing two CM genes, shown in Figure 2A, showing an increase in the number of elements within each hub. Figure 3B shows the distribution of elements per hub, confirming a striking increase in the percentage of hubs with 10 or more elements. Associated with this increase in the percentage of complex hubs, there was an increase in the average number of connections per gene in the Ctcf-KO cells and a decrease in the number of connections per enhancer (Figure 3C, Supplementary Figure S1B). The decrease in the number of connections per enhancer was due to the overall increase in the number of enhancers that crossed the ABC threshold while maintaining the same number of expressed genes. Next, we asked if this reduction in the total number of CRHs in the Ctcf-KO was merely due to the merging of CRHs or the formation of new CRHs and observed both cases. The merging of CRHs was observed primarily in the simple hubs (2–3 elements per hub) located next to each other as 403 (82%) of the 492 simple hubs merged with 1 or more other hubs to form larger hubs. For the larger hubs, there was redistribution and formation of new hubs as genes gained new interactions and lost other interactions. To further confirm the importance of CTCF in the organization of these hubs, we compared CTCF binding sites in the control hubs vs. Ctcf-KO hubs. Our analysis showed an enrichment of CTCF in the control hubs (Fisher’s exact test odds ratio of 1.69, two-sided p < 2 × 10−16). Next, we analyzed the relationship between CRHs and TADs, since TAD boundaries are enriched for CTCF [25], and earlier studies have shown that CRHs are generally constrained within TAD boundaries [14]. Indeed, using the TADs in the control as a reference, our data showed that about 80% of CRHs in the control were contained within the same TAD, while only 20% spanned more than one TAD. In contrast, about 45% of CRHs in KO were contained within a TAD, while 50% of CRHs in the KO spanned more than one TAD (Figure 3D); furthermore, 5% were not within TAD boundaries, showing that the loss of CTCF leads to a reorganization of the CRHs with the formation of cross-TAD enhancer–promoter interactions.
Finally, we sought to identify if CRH changes were directly correlated to gene expression after CTCF loss. We have previously shown that CTCF KO leads to CM dysfunction and global transcriptomic changes after 2 weeks [25]; here, we wanted to gain deeper insight into how that related to CRH. Our analysis showed that despite the changes in the structure of CRHs, these changes were not directly reflected in gene expression. Indeed, there was no correlation between gain or loss of CRH connections and gene expression, suggesting that other secondary effects of the CTCF KO may have come into play to regulate the gene expression profile at 2 weeks when the RNA-seq was performed. This has been observed in other studies that showed that the direct effect of CTCF or Cohesin (another architectural protein) on RNA-seq is observed within a few hours of deletion [32,33]. As time progresses, other secondary factors come into play and obscure the direct effect of these proteins on gene expression.
3. Discussion
Non-coding cis-regulatory elements (CRE) play crucial roles in development and disease and hold promise for next-generation therapeutic targets. While different studies by us and others have annotated these CREs and CRHs in different cellular models, the role of CTCF in ABC-derived CRHs has yet to be studied. First, our study applied the ABC algorithm to the mouse heart, generating, to the best of our knowledge, the first such ranked enhancer scores in mice hearts. Our findings confirm a positive correlation between genes and their ABC-linked enhancers in cardiac disease and affirm that genes and enhancers within the same hub are co-regulated during pressure-overload-induced CM hypertrophy. Integrating ABC to identify CRHs may thus represent another method to analyze the effect of regulatory SNPs on target genes, as the SNP may affect a target gene when they are in the same hub, even when there is no direct link through pair-wise E-G analysis [14]. Analysis of the CRHs also suggests that tissue-specific genes are more likely to be contained in hubs with high connectivity and rich in distal elements. This increased number of E-P networks provides redundancy and phenotypic robustness while guaranteeing increased transcriptional activity thanks to multi-enhancer networks.
Loss of CTCF affects the enhancer interactome, particularly impacting the H3K27ac interactions co-bound by CTCF. This leads to E-Ps that span larger distances and gain of new interactions. The loss of CTCF also leads to a re-organization of ABC-selected enhancers with a direct impact on the CRHs, resulting in fewer hubs but more elements per hub. This implies a crucial role for CTCF in determining which enhancers cross the ABC threshold and the organization of ABC-derived CRHs. While this study focused on the effect of loss of CTCF on overall CRH structure and its relationship with TAD boundaries, future studies will examine individual CRHs and their relationship to CTCF binding to ascertain how loss of CTCF can lead to local dysregulation of CRHs. Indeed, this finding has implications for mutations that affect local CTCF binding at regions that are not TAD boundaries, as it may lead to the merging of two or more CRHs and, thus, the establishment of an indirect connection between genes and enhancers within the newly formed hubs. In addition, given that the binding of CTCF to DNA is methylation-sensitive [34,35], this also supports findings that conditions that lead to DNA hypermethylation will reduce local or global CTCF binding [36]. This loss of CTCF binding will thus lead to the loss of insulation at TAD boundaries with a consequential gain of ectopic enhancers and reorganization of ABC enhancers [36].
4. Materials and Methods
4.1. Animal Experiments
Animal experiments were performed under a license approved by the Institutional Animal Care and Use Committee (National University of Singapore). Ctcfflox/flox mice harboring LoxP sites flanking exons 3 and 12 of the Ctcf gene against a C57/bl6 strain background were used as previously published. AAV9-cTnt-eGFP and AAV9-cTnt-Cre-tdTomato vectors encoding codon improved Cre recombinase or eGFP under the control of cardiac troponin T promoter were used for the control and Ctcf knock-out, respectively [25]. Experiments were performed on adult 6- to 8-week-old mice. Cardiomyocytes were isolated from mouse hearts after 2 weeks of AAV9 injections for control and Ctcf-KO mice. Adeno-associated virus 9 (AAV9) virus for the delivery of Cre recombinase was generated as previously described [25].
4.2. H3k27ac HiChip Library Prep and Data Analysis
Mouse adult cardiomyocytes were isolated following previously published protocol [25]. MNase-based HiChIP assay was performed on 2 × 106 isolated CMs using the Dovetail Genomics HiChIP MNase kit protocol [37]. Frozen cells were resuspended in 1× PBS and crosslinked with 3 mM DSG and 1% formaldehyde. Washed cells were digested with 0.75 µl MNase in 100 μl of nuclease digest buffer with MgCl2. Cells were lysed with 1× RIPA, and clarified lysate was used for ChIP. H3K27ac antibody (Abcam 04729, Cambridge, UK) was used to perform chromatin immunoprecipitation. The protein A/G bead pulldown, proximity ligation and libraries were prepared as described in the Dovetail protocol (Dovetail HiChIP MNase Kit, Cantata Bio, Scotts Valley, CA, USA). Libraries were sequenced on an Illumina Novaseq (Supplementary Table S3 shows the sequencing QC).
HiChIP paired-end reads were aligned with BWA MEM [38] (version 0.7.17r1198-dirty) with the -5SP flag enabled with an index containing only the canonical chromosomes of the mm10 genome (available from the UCSC genome). The resulting alignments were then parsed with pairtools (versions 0.3.0) with the following options –min-mapq 40 –walks-policy 5unque –max-inter-align gap 30 and the thre –chroms-path file corresponding to the size of the chromosomes used for the alignment index. Parse pairs were deduplicated and sorted with pairtools. Valid pairs were identified and through pairtools selected (pair type = ‘UU’, ‘UR’, ‘RU’, ‘uu’, ‘uU’ and ‘Uu’) and downsampled to the lowest number of valid pairs in a sample (126 million in LACZ) with pairtools sample. Contact matrices in .hic format were generated with the juicetools pre function (version 1.22.01).
FitHiChIP [39] (version 9.1) was used to identify significant interactions (“loops”) from the valid subsampled pairs at 10 kb resolution with the following settings: loop type = all-to-all, coverage = bias correction, merge redundant loops = Yes, background model = FithHiChIP(L), FDR < 0.1, minimum interaction size = 20 kb, maximum interaction size = 2 mb. Conditionally unique (meaning only occurring in LACZ or CRE) and shared loops were identified with bedtools pairToPair (version 2.28.0), requiring both loop anchors to overlap the same coordinates be flagged as shared. Loop size was assessed by performing a two-sided Wilcoxson’s rank-sum test on the distribution of loop ranges (distance between the two anchors). Loop anchors were then intersected with the LACZ CTCF binding sites to determine the proportion of loops resulting from CTCF presence. p-values for CTCF mediated loops were obtained through a two-sided Fischer’s Exact test. Aggregate Peak Analysis (APA) [32] was computed with juicetools APA function. Loop anchors were used to as the sites to aggregate over at 10 kb resolution. APA enrichment scores, loop center from the lower-left (LL) corner, are shown as both an APA score (ratio of the mean of center to mean of LL) and a Z-score. APA-normed output was used to plot the APA matrices.
H3K27ac HiChIP data were summarized and visualized with R; the package eulurr was used for the shared and unique loop Venn diagrams and ggplot2 for loop size and proportion of loops with CTCF binding site at the anchor. Additionally, ggplot2 was used to plot APA matrices with the geom_raster function, with the color scale set to the same min–max limits across all APA plots. At sites of interest, HiChIP loops files (in longrange format) were used to visualize loops along with ChIP-seq coverage and RNA-seq activity in the WashU Epigenome Browser (
4.3. RNA-Seq
RNA was extracted from 3 biological replicates of control and CTCF KO ventricular CMs (1 million cells each). The paired-end libraries were constructed using Tru-seq kits (Illumina, San Diego, CA, USA) and resulting libraries were sequenced on the Novaseq platform, generating 2 × 150 bp paired-end reads. The RNA reads were mapped to the mouse genome with Tophat version 2.0.11 with default parameters, and gene count was computed with htseq-count. Differential gene expression analysis was performed with EdgeR [40].
4.4. Chip-Seq Experiment
ChIP experiments on isolated CMs (1 million cells each) were performed as described previously [25]. Briefly, CMs were cross-linked with 1% formaldehyde for 10 min at room temperature and quenched with glycine (0.125 mol/L final concentration) [25]. Cells were then washed in ice-cold PBS and pelleted and lysed in FA lysis buffer (10 mmol/L Tris-HCl, pH 8.0, 0.25% Triton X-100, 10 mmol/L EDTA, 0.1 mol/L NaCl). To facilitate cell lysis, the cell pellet was passed through a 27.5-gauge needle gently 20 times. Nuclei were pelleted by centrifugation resuspended in sonication buffer, and chromatin was fragmented via sonication to an average size of 200 to 300 bp (Bioruptor Plus, Diagenode, Denville, NJ, USA). Chromatin was immunoprecipitated against H3K27ac (Abcam ab4729) or CTCF (EMD Millipore, catalog No. 07–729, Burlington, MA, USA) overnight. Antibody–chromatin complexes were pulled down with Protein G Dynabeads (Invitrogen, catalog No. 10003D, Carlsbad, CA, USA), washed with 0.1% SDS lysis buffer and eluted with elution buffer (1% SDS, 10 mmol/L EDTA, 50 mmol/L Tris-HCl, pH 8). After cross-link reversal (4 h of incubation at 65 °C) and proteinase K treatment, immunoprecipitated DNA was extracted. ChIP DNA was quantified by fluorometric quantification (Thermo Fisher Scientific, Waltham, MA, USA, Qubit dsDNA HS assay kit, catalog No. 32851). Library preparation was performed with the New England Biolabs Ultra II Kit (New England Biolabs, Ipswich, MA, USA) according to the manufacturer’s specifications and sequenced on the Illumina NextSeq High platform.
4.5. ATAC-Seq Experiment and Analysis
The ATAC-seq was performed for both control and Ctcf KO CMs according to the previously published Omni-ATAC protocol [41]. Briefly, 50,000 adult cardiomyocytes were pelleted at 500 g for 5 min. The cells were resuspended in 100 µL ATAC-resuspension buffer containing 0.5% NP40, 0.5 % Tween-20 and 0.01% Digitonin. The cells were left on ice for 5 min, after which 1 mL of cold ATAC-RSB containing 0.1% Tween 20 was added to wash out the lysis buffer. The nuclei were pelleted at 500 g for 5 min, the supernatant was carefully removed and the nuclei were resuspended in 50 µL transposition mixture containing 25 µL of 2x TD buffer, 16.5 µL of PBS, 5 µL nuclease free water, 0.5 µL of 1% digitonin, 0.5 µL of 10% Tween-20 and 2.5 µL of transposase (Illumina Tagment DNA enzyme 1, catalog number 20034198). The reaction was incubated at 37 °C for 30 min in a thermoshaker with 1000 RPM. After the reaction, DNA was extracted using the NEB Monarch® PCR & DNA cleanup kit (catalog number T1030L, New England Biolabs), PCR was performed using the Illumina/Nextera primers after which Ampure XP beads (Beckman Coulter, Indianapolis, IN, USA) were used for library cleanup. The resulting library was sequenced on Nextseq.
4.6. ChIP-Seq and ATAC-Seq Data Analysis
For both H3K27ac ChIP-Seq and ATAC-Seq, fastq files were aligned to the mm10 reference genome using the Burrows-Wheeler alignment tool in bwa software v0.7.19 [38]. The corresponding bam files were obtained using samtools v1.22.1 [42], while peak calling was performed with macS2 software v2.2.9.1 [43]. All peaks with a q-value less or equal to 0.1 were considered as statistically significant and used in the activity-by-contact score [3] for CRH construction [14]. Supplementary Table S5 lists H3K27ac peaks in Sham and TAC-treated mouse cardiomyocytes, as well as those in control vs. Ctcf-KO cardiomyocytes.
4.7. ABC-Score and CRH Analysis
The ABC model defines active enhancers based on a quantitative score of DNAse or ATAC-seq, H3K27ac and normalized Hi-C contact number [3]. This score is computed relative to a background activity over a 5 Mb window around a candidate element. Then, we set the threshold to 0.01, beyond which a candidate element is considered as a distal element. As an extension of the ABC score, CRHs [14] were defined as bipartite networks between promoters and distal elements (igraph R package) [44]. TADs were called using the directionality index as previously described. Pearson correlation analysis was performed using an R package (v2.1.4).
5. Conclusions
Our findings demonstrate that CTCF plays a critical role in determining the regulatory strength of enhancers as well as the membership and organization of the CRHs. Given that the CRHs play a role in disease and cell-type-specific gene expression, processes that affect CTCF expression and genomic binding will lead to a disorganization of CRHs and a subsequent effect on transcriptional regulation upon external stimuli. Although we did not find a correlation between the reorganization of CRHs and gene expression upon the loss of CTCF, this was mostly due to other secondary effects in the regulation of gene expression. However, we believe that our findings reveal important principles of the role of CTCF in regulating CRHs and ABC scores. A major limitation of our study is that the resolution of 10 kb for our HiChIP is not optimal, as each 10 kb window may contain various enhancers, and hence our analysis may miss out on some enhancer-gene interactions. In addition, as the various components of the ABC analysis, namely ATAC, ChIP and HiC, may have technical variations, an independent ABC analysis would be valuable to confirm our findings. However, we believe that our study forms the foundation for future analysis on the role of CTCF in ABC enhancer scores and as such the relative contribution of enhancers to each gene. Our future studies will delve deeper into these questions.
M.L., D.P.L., W.H.Z., L.H.G., C.K.C., Y.P.L., R.M.Q.W., P.Y.L., Y.Z. and C.G.A.-N. performed the experiments, M.L., L.M., C.C.P., W.T., S.B., A.B. and C.G.A.-N. provided data analysis. S.B., A.B., R.S.-Y.F. and C.G.A.-N. Provided supervision for experiments and data analysis. All authors have read and agreed to the published version of the manuscript.
Animal experiments were performed under a license approved by the Institutional Animal Care and Use Committee (National University of Singapore). The license number is R20-1204 (approval date: December 2020).
Not applicable.
The raw data supporting the conclusion of this article has been deposited in a public repository.
We thank Cantata Bio for providing the HiChIP kits. We thank Charles Joly-Beauparlant and Tania Cuppens for helping with data analysis and result interpretation.
Cory C. Padilla works for Cantata Bio, the developers of the HiChIP kit used in the study. The other authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Footnotes
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Figure 1 (A) USCS screenshot showing two examples of CM genes and their top ABC-linked enhancers; (B) Pearson correlation coefficient showing correlation between differentially expressed genes and differential enhancer peaks for ABC-linked enhancers in cardiac disease.
Figure 2 CRH in control cardiomyocytes. (A) Bar chart showing distribution of number of elements per CRH. (B) Examples of 2 CRHs containing CM genes Myom1 and Tnni3 in the control cardiomyocytes. (C) Pearson correlation coefficient between differential expression of genes and enhancer peaks contained within the same CRHs during CM stress. (D) Box plot showing characteristics of CRHs containing CM-specific genes. These CRHs tend to have more enhancers and promoters.
Figure 3 CRHs in the KO cells. (A) Examples of two CRHs containing the CM genes Myom1 and Tnni3 in the Ctcf-KO cardiomyocytes. Compared to the same CRHs in control, there is an increase in the number of elements in each hub. (B) Bar chart showing the distribution of elements per CRH in Control and Ctcf-KO. There is an increase in the percentage of CRHs with 10 or more elements from 15% in the control to 60% in the CTCF KO cardiomyocytes. (C) Bar chart showing the number of connections per gene promoters and per enhancers. There is an increase in the average number of connections per gene in the Ctcf-KO CMs. p < 0.0001. (D) Bar chart showing overlap of CRHs with TADs, 80% of TADs are contained in one TAD in the control, while only 40% of CRHs are contained within one TAD in the CTCF KO. In contrast, 20% of CRHs span two or more TADs in the control, while 50% of CRHs span two or more TADs in CTCF KO.
Supplementary Materials
The following supporting information can be downloaded at:
1. Maurano, M.T.; Humbert, R.; Rynes, E.; Thurman, R.E.; Haugen, E.; Wang, H.; Reynolds, A.P.; Sandstrom, R.; Qu, H.; Brody, J.
2. Boix, C.A.; James, B.T.; Park, Y.P.; Meuleman, W.; Kellis, M. Regulatory genomic circuitry of human disease loci by integrative epigenomics. Nature; 2021; 590, pp. 300-307. Erratum in Nature 2025, 643, E11 [DOI: https://dx.doi.org/10.1038/s41586-020-03145-z]
3. Fulco, C.P.; Nasser, J.; Jones, T.R.; Munson, G.; Bergman, D.T.; Subramanian, V.; Grossman, S.R.; Anyoha, R.; Doughty, B.R.; Patwardhan, T.A.
4. Yao, L.; Liang, J.; Ozer, A.; Leung, A.K.; Lis, J.T.; Yu, H. A comparison of experimental assays and analytical methods for genome-wide identification of active enhancers. Nat. Biotechnol.; 2022; 40, pp. 1056-1065. [DOI: https://dx.doi.org/10.1038/s41587-022-01211-7] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/35177836]
5. Anene-Nzelu, C.G.; Lee, M.C.J.; Tan, W.L.W.; Dashi, A.; Foo, R.S.Y. Genomic enhancers in cardiac development and disease. Nat. Rev. Cardiol.; 2021; 19, pp. 7-25. [DOI: https://dx.doi.org/10.1038/s41569-021-00597-2] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/34381190]
6. Gasperini, M.; Tome, J.M.; Shendure, J. Towards a comprehensive catalogue of validated and target-linked human enhancers. Nat. Rev. Genet.; 2020; 21, pp. 292-310. [DOI: https://dx.doi.org/10.1038/s41576-019-0209-0]
7. Whalen, S.; Truty, R.M.; Pollard, K.S. Enhancer-promoter interactions are encoded by complex genomic signatures on looping chromatin. Nat. Genet.; 2016; 48, pp. 488-496. [DOI: https://dx.doi.org/10.1038/ng.3539] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/27064255]
8. Yao, L.; Shen, H.; Laird, P.W.; Farnham, P.J.; Berman, B.P. Inferring regulatory element landscapes and transcription factor networks from cancer methylomes. Genome Biol.; 2015; 16, 105. [DOI: https://dx.doi.org/10.1186/s13059-015-0668-3]
9. Alexanian, M.; Przytycki, P.F.; Micheletti, R.; Padmanabhan, A.; Ye, L.; Travers, J.G.; Gonzalez-Teran, B.; Silva, A.C.; Duan, Q.; Ranade, S.S.
10. Frangoul, H.; Altshuler, D.; Cappellini, M.D.; Chen, Y.S.; Domm, J.; Eustace, B.K.; Foell, J.; de la Fuente, J.; Grupp, S.; Handgretinger, R.
11. Matharu, N.; Rattanasopha, S.; Tamura, S.; Maliskova, L.; Wang, Y.; Bernard, A.; Hardin, A.; Eckalbar, W.L.; Vaisse, C.; Ahituv, N. CRISPR-mediated activation of a promoter or enhancer rescues obesity caused by haploinsufficiency. Science; 2019; 363, 6424. [DOI: https://dx.doi.org/10.1126/science.aau0629]
12. Di Giammartino, D.C.; Kloetgen, A.; Polyzos, A.; Liu, Y.; Kim, D.; Murphy, D.; Abuhashem, A.; Cavaliere, P.; Aronson, B.; Shah, V.
13. Espinola, S.M.; Gotz, M.; Bellec, M.; Messina, O.; Fiche, J.B.; Houbron, C.; Dejean, M.; Reim, I.; Cardozo Gizzi, A.M.; Lagha, M.
14. Mangnier, L.; Joly-Beauparlant, C.; Droit, A.; Bilodeau, S.; Bureau, A. Cis-regulatory hubs: A new 3D model of complex disease genetics with an application to schizophrenia. Life Sci. Alliance; 2022; 5, [DOI: https://dx.doi.org/10.26508/lsa.202101156] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/35086934]
15. Weintraub, A.S.; Li, C.H.; Zamudio, A.V.; Sigova, A.A.; Hannett, N.M.; Day, D.S.; Abraham, B.J.; Cohen, M.A.; Nabet, B.; Buckley, D.L.
16. Di Giammartino, D.C.; Polyzos, A.; Apostolou, E. Transcription factors: Building hubs in the 3D space. Cell Cycle; 2020; 19, pp. 2395-2410. [DOI: https://dx.doi.org/10.1080/15384101.2020.1805238]
17. Rao, S.S.; Huntley, M.H.; Durand, N.C.; Stamenova, E.K.; Bochkov, I.D.; Robinson, J.T.; Sanborn, A.L.; Machol, I.; Omer, A.D.; Lander, E.S.
18. Li, T.; Jia, L.; Cao, Y.; Chen, Q.; Li, C. OCEAN-C: Mapping hubs of open chromatin interactions across the genome reveals gene regulatory networks. Genome Biol.; 2018; 19, 54. [DOI: https://dx.doi.org/10.1186/s13059-018-1430-4]
19. Madsen, J.G.S.; Madsen, M.S.; Rauch, A.; Traynor, S.; Van Hauwaert, E.L.; Haakonsson, A.K.; Javierre, B.M.; Hyldahl, M.; Fraser, P.; Mandrup, S. Highly interconnected enhancer communities control lineage-determining genes in human mesenchymal stem cells. Nat. Genet.; 2020; 52, pp. 1227-1238. [DOI: https://dx.doi.org/10.1038/s41588-020-0709-z] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/33020665]
20. Oudelaar, A.M.; Harrold, C.L.; Hanssen, L.L.P.; Telenius, J.M.; Higgs, D.R.; Hughes, J.R. A revised model for promoter competition based on multi-way chromatin interactions at the alpha-globin locus. Nat. Commun.; 2019; 10, 5412. [DOI: https://dx.doi.org/10.1038/s41467-019-13404-x]
21. Pliner, H.A.; Packer, J.S.; McFaline-Figueroa, J.L.; Cusanovich, D.A.; Daza, R.M.; Aghamirzaie, D.; Srivatsan, S.; Qiu, X.; Jackson, D.; Minkina, A.
22. Anene-Nzelu, C.G.; Tan, W.L.W.; Lee, C.J.M.; Wenhao, Z.; Perrin, A.; Dashi, A.; Tiang, Z.; Autio, M.I.; Lim, B.; Wong, E.
23. Tsai, A.; Alves, M.R.; Crocker, J. Multi-enhancer transcriptional hubs confer phenotypic robustness. Elife; 2019; 8, e45325. [DOI: https://dx.doi.org/10.7554/eLife.45325] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/31294690]
24. Ong, C.T.; Corces, V.G. Enhancer function: New insights into the regulation of tissue-specific gene expression. Nat. Rev. Genet.; 2011; 12, pp. 283-293. [DOI: https://dx.doi.org/10.1038/nrg2957]
25. Lee, D.P.; Tan, W.L.W.; Anene-Nzelu, C.G.; Lee, C.J.M.; Li, P.Y.; Luu, T.D.A.; Chan, C.X.; Tiang, Z.; Ng, S.L.; Huang, X.
26. Rosa-Garrido, M.; Chapski, D.J.; Schmitt, A.D.; Kimball, T.H.; Karbassi, E.; Monte, E.; Balderas, E.; Pellegrini, M.; Shih, T.T.; Soehalim, E.
27. Man, J.C.K.; van Duijvenboden, K.; Krijger, P.H.L.; Hooijkaas, I.B.; van der Made, I.; de Gier-de Vries, C.; Wakker, V.; Creemers, E.E.; de Laat, W.; Boukens, B.J.
28. Sergeeva, I.A.; Hooijkaas, I.B.; Ruijter, J.M.; van der Made, I.; de Groot, N.E.; van de Werken, H.J.; Creemers, E.E.; Christoffels, V.M. Identification of a regulatory domain controlling the Nppa-Nppb gene cluster during heart development and stress. Development; 2016; 143, pp. 2135-2146. [DOI: https://dx.doi.org/10.1242/dev.132019] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/27048739]
29. Dickel, D.E.; Barozzi, I.; Zhu, Y.; Fukuda-Yuzawa, Y.; Osterwalder, M.; Mannion, B.J.; May, D.; Spurrell, C.H.; Plajzer-Frick, I.; Pickle, C.S.
30. Hang, C.; Song, Y.; Li, Y.; Zhang, S.; Chang, Y.; Bai, R.; Saleem, A.; Jiang, M.; Lu, W.; Lan, F.
31. Osterwalder, M.; Barozzi, I.; Tissieres, V.; Fukuda-Yuzawa, Y.; Mannion, B.J.; Afzal, S.Y.; Lee, E.A.; Zhu, Y.; Plajzer-Frick, I.; Pickle, C.S.
32. Nora, E.P.; Goloborodko, A.; Valton, A.L.; Gibcus, J.H.; Uebersohn, A.; Abdennur, N.; Dekker, J.; Mirny, L.A.; Bruneau, B.G. Targeted Degradation of CTCF Decouples Local Insulation of Chromosome Domains from Genomic Compartmentalization. Cell; 2017; 169, pp. 930-944.e922. [DOI: https://dx.doi.org/10.1016/j.cell.2017.05.004]
33. Rao, S.S.P.; Huang, S.C.; Glenn St Hilaire, B.; Engreitz, J.M.; Perez, E.M.; Kieffer-Kwon, K.R.; Sanborn, A.L.; Johnstone, S.E.; Bascom, G.D.; Bochkov, I.D.
34. Hark, A.T.; Schoenherr, C.J.; Katz, D.J.; Ingram, R.S.; Levorse, J.M.; Tilghman, S.M. CTCF mediates methylation-sensitive enhancer-blocking activity at the H19/Igf2 locus. Nature; 2000; 405, pp. 486-489. [DOI: https://dx.doi.org/10.1038/35013106]
35. Wang, H.; Maurano, M.T.; Qu, H.; Varley, K.E.; Gertz, J.; Pauli, F.; Lee, K.; Canfield, T.; Weaver, M.; Sandstrom, R.
36. Flavahan, W.A.; Drier, Y.; Liau, B.B.; Gillespie, S.M.; Venteicher, A.S.; Stemmer-Rachamimov, A.O.; Suva, M.L.; Bernstein, B.E. Insulator dysfunction and oncogene activation in IDH mutant gliomas. Nature; 2016; 529, pp. 110-114. [DOI: https://dx.doi.org/10.1038/nature16490]
37. Sept, C.E.; Tak, Y.E.; Cerda-Smith, C.G.; Hutchinson, H.M.; Goel, V.; Blanchette, M.; Bhakta, M.S.; Hansen, A.S.; Joung, J.K.; Johnstone, S.
38. Li, H.; Durbin, R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics; 2010; 26, pp. 589-595. [DOI: https://dx.doi.org/10.1093/bioinformatics/btp698] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/20080505]
39. Bhattacharyya, S.; Chandra, V.; Vijayanand, P.; Ay, F. Identification of significant chromatin contacts from HiChIP data by FitHiChIP. Nat. Commun.; 2019; 10, 4221. [DOI: https://dx.doi.org/10.1038/s41467-019-11950-y]
40. Robinson, M.D.; McCarthy, D.J.; Smyth, G.K. edgeR: A Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics; 2010; 26, pp. 139-140. [DOI: https://dx.doi.org/10.1093/bioinformatics/btp616]
41. Corces, M.R.; Trevino, A.E.; Hamilton, E.G.; Greenside, P.G.; Sinnott-Armstrong, N.A.; Vesuna, S.; Satpathy, A.T.; Rubin, A.J.; Montine, K.S.; Wu, B.
42. Li, H.; Handsaker, B.; Wysoker, A.; Fennell, T.; Ruan, J.; Homer, N.; Marth, G.; Abecasis, G.; Durbin, R.; Genome Project Data Processing, S. The Sequence Alignment/Map format and SAMtools. Bioinformatics; 2009; 25, pp. 2078-2079. [DOI: https://dx.doi.org/10.1093/bioinformatics/btp352] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/19505943]
43. Zhang, Y.; Liu, T.; Meyer, C.A.; Eeckhoute, J.; Johnson, D.S.; Bernstein, B.E.; Nusbaum, C.; Myers, R.M.; Brown, M.; Li, W.
44. Csardi, G.N.T. The igraph software package for complex network research. Int. J. Complex Syst.; 2006; 1695, pp. 1-9.
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
The 3D chromatin architecture establishes a complex network of genes and regulatory elements necessary for transcriptomic regulation in development and disease. This network can be modeled by cis-regulatory hubs (CRH), which underscore the local functional interactions between enhancers and promoter regions and differ from other higher-order chromatin structures such as topologically associated domains (TAD). The activity-by-contact (ABC) model of enhancer–promoter regulation has been recently used in the identification of these CRHs, but little is known about the role of transcription factor CCTC binding factor (CTCF) on ABC scores and their consequent impact on CRHs. Here, we show that the loss of CTCF leads to a reorganization of the ABC-derived rankings of putative enhancers in the mouse heart, a global reduction in the total number of CRHs and an increase in the size of CRHs. Furthermore, CTCF loss leads to a higher percentage of CRHs that cross TAD boundaries. These results provide additional evidence to support the importance of CTCF in forming the regulatory networks necessary for gene regulation.
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer
Details
; Mangnier Loïc 2 ; Padilla, Cory C 3 ; Lee, Dominic Paul 1 ; Wilson, Tan 1 ; Zheng Wen Hao 1 ; Gan, Louis Hanqiang 1 ; Chen, Ching Kit 4
; Lim, Yee Phong 5 ; Wang Rina Miao Qin 5 ; Li, Peter Yiqing 1 ; Zhu Yonglin 6 ; Bilodeau, Steve 7 ; Bureau, Alexandre 8 ; Foo, Roger Sik-Yin 1 ; George, Anene-Nzelu Chukwuemeka 9
1 Cardiovascular Disease Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117599, Singapore, Genome Institute of Singapore, A*STAR, Singapore 138672, Singapore
2 Centre de Recherche CERVO, Québec, QC G1E 1T2, Canada, Département de Médecine Sociale et Préventive, Université Laval, Québec, QC G1V 0A6, Canada, Centre de Recherche en Données Massives de l’Université Laval, Québec, QC G1V 0A6, Canada, Centre de Recherche du Centre Hospitalier Universitaire de Québec, Université Laval, Axe Oncologie, Québec, QC G1R 3S3, Canada
3 Cantata Bio, 100 Enterprise Way Suite A-101, Scotts Valley, CA 95066, USA
4 Cardiovascular Disease Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117599, Singapore, Genome Institute of Singapore, A*STAR, Singapore 138672, Singapore, Department of Paediatrics, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 119228, Singapore, Khoo Teck Puat-National University Children’s Medical Institute, National University of Singapore, Singapore 119074, Singapore
5 Cardiovascular Disease Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117599, Singapore, Genome Institute of Singapore, A*STAR, Singapore 138672, Singapore, Department of Paediatrics, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 119228, Singapore
6 Montreal Heart Institute, Montreal, QC H1T 1C8, Canada, Department of Medicine, University of Montreal, Montreal, QC H3T 1J4, Canada
7 Centre de Recherche en Données Massives de l’Université Laval, Québec, QC G1V 0A6, Canada, Centre de Recherche du Centre Hospitalier Universitaire de Québec, Université Laval, Axe Oncologie, Québec, QC G1R 3S3, Canada, Centre de Recherche sur le Cancer de l’Université Laval, Québec, QC G1R 3S3, Canada, Département de Biologie Moléculaire, Biochimie Médicale et Pathologie, Faculté de Médecine, Université Laval, Québec, QC G1V 0A6, Canada
8 Centre de Recherche CERVO, Québec, QC G1E 1T2, Canada, Département de Médecine Sociale et Préventive, Université Laval, Québec, QC G1V 0A6, Canada, Centre de Recherche en Données Massives de l’Université Laval, Québec, QC G1V 0A6, Canada
9 Cardiovascular Disease Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117599, Singapore, Genome Institute of Singapore, A*STAR, Singapore 138672, Singapore, Montreal Heart Institute, Montreal, QC H1T 1C8, Canada, Department of Medicine, University of Montreal, Montreal, QC H3T 1J4, Canada





