Abstract

Epigenetic regulation orchestrates mammalian transcription, but the functional links between epigenetic marks and transcription remain elusive. To tackle this problem, we use epigenomic and transcriptomic data from 13 ENCODE cell types to train machine learning models that predict gene expression from histone post-translational modifications (PTMs), achieving transcriptome-wide correlations of ~0.70-0.79 for most cell types. Our models recapitulate known associations between histone PTMs and expression patterns, including the prediction that acetylation of histone subunit H3 lysine residue 27 (H3K27ac) near the transcription start site (TSS) significantly increases expression levels. To validate this prediction experimentally and to investigate how natural vs. engineered deposition of H3K27ac might differentially affect expression, we apply the synthetic dCas9-p300 histone acetyltransferase system to 8 genes in the HEK293T cell line and to 5 genes in the K562 cell line. Further, to facilitate model building, we perform MNase-seq to map genome-wide nucleosome occupancy levels in HEK293T. We observe that our models rank relative fold-changes among genes in response to the dCas9-p300 system well; however, their ability to rank fold-changes within individual genes is noticeably diminished compared to their ability to predict expression across cell types from native epigenetic signatures. Our findings highlight the need for more comprehensive genome-scale epigenome editing datasets, a better understanding of the actual modifications made by epigenome editing tools, and improved causal models that transfer better from endogenous cellular measurements to perturbation experiments. Together, these improvements would make it easier to understand and predictably control the dynamic human epigenome, with consequences for human health.
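The abstract evaluates models by how well predicted values track measured ones, both as a transcriptome-wide correlation and as a ranking of fold-changes. The following is a minimal sketch of a rank-based comparison of that kind. It assumes Spearman rank correlation as the ranking metric, which the abstract does not specify, and the function names and toy fold-change values are hypothetical illustrations, not data or code from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy values (not from the study): measured and predicted
# log2 fold-changes for five genes after a perturbation.
observed_log2fc = [0.2, 1.5, 3.1, 0.9, 2.4]
predicted_log2fc = [0.5, 1.1, 2.0, 1.3, 2.2]

rho = spearman(observed_log2fc, predicted_log2fc)
print(f"Spearman rho across genes: {rho:.2f}")
```

A rank-based metric only asks whether the model orders genes (or perturbations within a gene) correctly, so it is insensitive to any mismatch in scale between predicted and measured fold-changes.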

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

* Included new benchmarking against other models. Made improvements to increase readability and clarity, and added further details of perturb-seq experiments.

Details

1009240
Title
Predicting the effect of CRISPR-Cas9-based epigenome editing
Publication title
bioRxiv; Cold Spring Harbor
Publication year
2025
Publication date
Feb 28, 2025
Section
New Results
Publisher
Cold Spring Harbor Laboratory Press
Source
BioRxiv
Place of publication
Cold Spring Harbor
Country of publication
United States
University/institution
Cold Spring Harbor Laboratory Press
Publication subject
ISSN
2692-8205
Source type
Working Paper
Language of publication
English
Document type
Working Paper
Publication history
Milestone dates
2023-10-03 (Version 1); 2024-10-16 (Version 2)
ProQuest document ID
3172351561
Document URL
https://www.proquest.com/working-papers/predicting-effect-crispr-cas9-based-epigenome/docview/3172351561/se-2?accountid=208611
Copyright
© 2025. This article is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (“the License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-03-01
Database
ProQuest One Academic