© 2020 Onimaru et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

With advances in sequencing technology, a vast amount of genomic sequence information has become available. However, annotating biological functions in genome sequences without experiments, particularly for non-protein-coding regions, remains a challenging task. Recently, deep learning–based methods were shown to predict gene regulatory regions from genome sequences, promising to aid the interpretation of genomic sequence data. Here, we report an improvement in prediction accuracy for gene regulatory regions, achieved with a design of convolution layers that efficiently processes genomic sequence information, and we present DeepGMAP, a software package for training and comparing different deep learning–based models (https://github.com/koonimaru/DeepGMAP). First, we demonstrate that our convolution layers, termed forward- and reverse-sequence scan (FRSS) layers, integrate both forward and reverse strand information and enhance the power to predict gene regulatory regions. Second, we assessed previous studies and identified problems associated with data structures that caused overfitting. Finally, we introduce visualization methods to examine what the program learned. Together, our FRSS layers improve the prediction accuracy for gene regulatory regions.
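
The idea of scanning both strands with one set of weights can be illustrated with a small NumPy sketch. This is not the DeepGMAP implementation: the channel order `ACGT`, the helper names (`one_hot`, `scan`, `forward_reverse_scan`), and the choice to add the two score tracks are all assumptions made for illustration. With the assumed channel order, reversing a one-hot array along both axes yields its reverse complement, so a single kernel can be applied in both orientations.

```python
import numpy as np

BASES = "ACGT"  # assumed channel order; with it, flipping both axes gives the reverse complement

def one_hot(seq):
    """Encode a DNA string as a (length, 4) array; unknown bases (e.g. N) stay all-zero."""
    out = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            out[i, j] = 1.0
    return out

def scan(x, kernel):
    """Valid cross-correlation of a (L, 4) sequence with a (k, 4) kernel, giving L - k + 1 scores."""
    k = kernel.shape[0]
    return np.array([np.sum(x[i:i + k] * kernel) for i in range(x.shape[0] - k + 1)])

def forward_reverse_scan(x, kernel):
    """Scan with the kernel and with its reverse complement, then add the two score tracks.

    Summation is an illustrative assumption, not a claim about the published layer.
    """
    rc_kernel = kernel[::-1, ::-1]  # reverse positions and swap A<->T, C<->G
    return scan(x, kernel) + scan(x, rc_kernel)

# Hypothetical example: a kernel that matches the motif GATA exactly.
seq = one_hot("TTGATATT")
motif = one_hot("GATA")
scores = forward_reverse_scan(seq, motif)
```

Under these assumptions, the score track of a sequence and that of its reverse complement are mirror images of each other, so a motif contributes the same evidence whichever strand it lies on.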

Details

Title
Predicting gene regulatory regions with a convolutional neural network for processing double-strand genome sequence information
Author
Onimaru, Koh; Nishimura, Osamu; Kuraku, Shigehiro
First page
e0235748
Section
Research Article
Publication year
2020
Publication date
Jul 2020
Publisher
Public Library of Science
e-ISSN
1932-6203
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2426534239