Abstract

Understanding the genetic regulatory code governing gene expression is an important challenge in molecular biology. However, how individual coding and non-coding regions of the gene regulatory structure interact and contribute to mRNA expression levels remains unclear. Here we apply deep learning to over 20,000 mRNA datasets to examine the genetic regulatory code controlling mRNA abundance in 7 model organisms ranging from bacteria to human. In all organisms, we can predict mRNA abundance directly from DNA sequence, with up to 82% of the variation of transcript levels encoded in the gene regulatory structure. By searching for DNA regulatory motifs across the gene regulatory structure, we discover that motif interactions could explain the whole dynamic range of mRNA levels. Co-evolution across coding and non-coding regions suggests that it is not single motifs or regions, but the entire gene regulatory structure and the specific combination of regulatory elements that define gene expression levels.

Regulatory and coding regions of genes are shaped by evolution to control expression levels. Here, the authors use deep learning to identify rules controlling gene expression levels and suggest that all parts of the gene regulatory structure interact in doing so.

Details

Title
Deep learning suggests that gene expression is encoded in all parts of a co-evolving interacting gene regulatory structure
Author
Zrimec, Jan 1; Börlin, Christoph S 2; Buric, Filip 1; Muhammad, Azam Sheikh 3; Chen, Rhongzen 3; Siewers, Verena 2; Verendel, Vilhelm 3; Nielsen, Jens 2; Töpel, Mats 4; Zelezniak, Aleksej 5

1 Chalmers University of Technology, Department of Biology and Biological Engineering, Gothenburg, Sweden (GRID:grid.5371.0) (ISNI:0000 0001 0775 6028)
2 Chalmers University of Technology, Department of Biology and Biological Engineering, Gothenburg, Sweden (GRID:grid.5371.0) (ISNI:0000 0001 0775 6028); Chalmers University of Technology, Novo Nordisk Foundation Center for Biosustainability, Gothenburg, Sweden (GRID:grid.5371.0) (ISNI:0000 0001 0775 6028)
3 Chalmers University of Technology, Computer Science and Engineering, Gothenburg, Sweden (GRID:grid.5371.0) (ISNI:0000 0001 0775 6028)
4 University of Gothenburg, Department of Marine Sciences, Gothenburg, Sweden (GRID:grid.8761.8) (ISNI:0000 0000 9919 9582); Gothenburg Global Biodiversity Center (GGBC), Gothenburg, Sweden (GRID:grid.8761.8)
5 Chalmers University of Technology, Department of Biology and Biological Engineering, Gothenburg, Sweden (GRID:grid.5371.0) (ISNI:0000 0001 0775 6028); Science for Life Laboratory, Stockholm, Sweden (GRID:grid.452834.c)
Publication year
2020
Publication date
2020
Publisher
Nature Publishing Group
e-ISSN
20411723
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2473271440
Copyright
© The Author(s) 2020. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.