The majority of gene transcripts generated by RNA polymerase II in mammalian genomes initiate at CpG island (CGI) promoters1,2, yet our understanding of their regulation remains limited. This is in part due to the incomplete information that we have on transcription factors, their DNA-binding motifs and which genomic binding sites are functional in any given cell type3-5. In addition, there are orphan motifs without known binders, such as the CGCG element, which is associated with highly expressed genes across human tissues and enriched near the transcription start site of a subset of CGI promoters6-8. Here we combine single-molecule footprinting with interaction proteomics to identify BTG3-associated nuclear protein (BANP) as the transcription factor that binds this element in the mouse and human genome. We show that BANP is a strong CGI activator that controls essential metabolic genes in pluripotent stem and terminally differentiated neuronal cells. BANP binding is repelled by DNA methylation of its motif in vitro and in vivo, which epigenetically restricts most binding to CGIs and accounts for differential binding at aberrantly methylated CGI promoters in cancer cells. Upon binding to an unmethylated motif, BANP opens chromatin and phases nucleosomes. These findings establish BANP as a critical activator of a set of essential genes and suggest a model in which the activity of CGI promoters relies on methylation-sensitive transcription factors that are capable of chromatin opening.
Binding and chromatin sensitivity of individual transcription factors, or their respective motifs, cannot be studied within regulatory regions as these usually consist of assemblies of multiple motifs. To test individual motifs, we developed a reductionist approach in which single transcription factor motifs are placed within an in-vitro-derived sequence and inserted into a specific genomic locus in mouse embryonic stem (ES) cells using recombinase-mediated cassette exchange (RMCE)9,10. The occupancy of these motifs is monitored by single-molecule footprinting (SMF), which uses methyltransferase footprinting and read-out by bisulfite sequencing11,12 (Fig. 1a). When applied to the motif for the RE1-silencing transcription factor (REST), a prominent footprint was detected that resembled genomic loci bound by REST12 (Extended Data Fig. 1a-d). As SMF does not require prior knowledge of the binding transcription factor, it should enable us to determine whether an orphan motif such as the CGCG element is occupied. Indeed, when testing the...