Abstract

BioFeatureFinder is a novel algorithm which allows analyses of many biological genomic landmarks (including alternatively spliced exons, DNA/RNA-binding protein binding sites, and gene/transcript functional elements, nucleotide content, conservation, k-mers, secondary structure) to identify distinguishing features. BFF uses a flexible underlying model that combines classical statistical tests with Big Data machine-learning strategies. The model is created using thousands of biological characteristics (features) that are used to build a feature map and interpret category labels in genomic ranges. Our results show that BFF is a reliable platform for analyzing large-scale datasets. We evaluated the RNA binding feature map of 110 eCLIP-seq datasets and were able to recover several well-known features from the literature for RNA-binding proteins; we were also able to uncover novel associations. BioFeatureFinder is available at https://github.com/kbmlab/BioFeatureFinder/.

Details

Title
BioFeatureFinder: Flexible, unbiased analysis of biological characteristics associated with genomic regions
Author
Ciamponi, Felipe E; Lovci, Michael T; Cruz, Pedro Rs; Massirer, Katlin B
University/institution
Cold Spring Harbor Laboratory Press
Section
New Results
Publication year
2018
Publication date
Mar 22, 2018
Publisher
Cold Spring Harbor Laboratory Press
ISSN
2692-8205
Source type
Working Paper
Language of publication
English
ProQuest document ID
2068979287
Copyright
�� 2018. This article is published under http://creativecommons.org/licenses/by-nd/4.0/ (���the License���). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.