Abstract

Over the past decade, R-loop structures have gained recognition as important regulators of the epigenome, telomere maintenance, DNA repair, and replication. Given these numerous functions, dozens, or potentially hundreds, of proteins could serve as direct or indirect regulators of R-loop writing, reading, and erasing. To understand the properties shared among potential R-loop binding proteins, we mined published proteomic studies and distilled 10 features that were enriched in R-loop binding proteins compared with the rest of the proteome. Applying an easy-ensemble machine learning approach, we used these R-loop binding protein-specific features, along with the proteins' amino acid composition, to create random forest classifiers that predict the likelihood that a protein binds R-loops. Known R-loop regulating pathways such as splicing, DNA damage repair, and chromatin remodeling are highly enriched in our datasets, and we validate 2 new R-loop binding proteins, LIG1 and FXR1, in human cells. Together, these datasets provide a reference for pursuing analyses of novel R-loop regulatory proteins.
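
The abstract names two ingredients that a short sketch may make more concrete: amino acid composition as a feature vector, and easy-ensemble sampling, in which the small set of known binders is repeatedly paired with equally sized random draws from the much larger remainder of the proteome. The following is a minimal illustration under stated assumptions, not the authors' code. The function names, the example sequence, and the subset count are hypothetical, and the step of training one random forest per balanced subset (which would need a library such as scikit-learn) is deliberately left out.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in a protein sequence.

    Non-standard characters are ignored. The output order follows AMINO_ACIDS.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def balanced_subsets(positives, negatives, n_subsets, seed=0):
    """Easy-ensemble-style sampling (hypothetical helper).

    Each subset keeps every positive example and adds an equally sized
    random draw, without replacement, from the negatives. Assumes the
    positives are the minority class.
    """
    if len(positives) > len(negatives):
        raise ValueError("expected positives to be the minority class")
    rng = random.Random(seed)
    subsets = []
    for _ in range(n_subsets):
        sampled_negatives = rng.sample(negatives, len(positives))
        subsets.append((list(positives), sampled_negatives))
    return subsets


def ensemble_score(per_model_probabilities):
    """Average the probabilities from the per-subset models into one score."""
    return sum(per_model_probabilities) / len(per_model_probabilities)


if __name__ == "__main__":
    # Hypothetical toy sequence, for illustration only.
    composition = amino_acid_composition("MKRGGSRRKA")
    print(dict(zip(AMINO_ACIDS, composition)))

    # Hypothetical identifiers standing in for binders and background proteins.
    subsets = balanced_subsets(["p1", "p2"], ["n1", "n2", "n3", "n4", "n5"], 3)
    print(subsets)
```

In a full pipeline, one classifier would be fitted on each balanced subset, and a protein's final score would be the average of the per-model probabilities, as in `ensemble_score`.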

Details

Title
Integrative analysis and prediction of human R-loop binding proteins
Author
Kumar, Arun 1; Fournier, Louis-Alexandre 1; Stirling, Peter C 1

1 Terry Fox Laboratory, BC Cancer, Vancouver, BC V5Z1L3, Canada
Publication year
2022
Publication date
Aug 2022
Publisher
Oxford University Press
e-ISSN
21601836
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3169738176
Copyright
© The Author(s) 2022. Published by Oxford University Press on behalf of Genetics Society of America. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.