Abstract

We find that current computational methods for estimating transcript abundance from RNA-seq data can lead to hundreds of false-positive results. We show that these systematic errors stem largely from a failure to model fragment GC content bias. Sample-specific biases associated with fragment sequence features lead to misidentification of transcript isoforms. We introduce alpine, a method for estimating sample-specific bias-corrected transcript abundance. By incorporating fragment sequence features, alpine greatly increases the accuracy of transcript abundance estimates, enabling a fourfold reduction in the number of false positives for reported changes in expression compared with Cufflinks. Using simulated data, we also show that alpine retains the ability to discover true positives, similar to other approaches.
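To make the term "fragment GC content" concrete, the sketch below computes the GC fraction of individual fragments and tallies fragments per GC bin, which is one simple way to see whether a sample over- or under-represents GC-rich fragments. It is an illustration only and not alpine's implementation. The function names, the 0-based end-exclusive fragment coordinates, and the equal-width binning are all assumptions made for this example.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Return the fraction of G or C bases in seq (case-insensitive)."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def fragment_gc(transcript: str, start: int, end: int) -> float:
    """GC content of the fragment transcript[start:end].

    Coordinates are assumed to be 0-based and end-exclusive.
    """
    return gc_content(transcript[start:end])


def gc_histogram(transcript: str, fragments, n_bins: int = 4):
    """Count fragments falling into each of n_bins equal-width GC bins.

    fragments is an iterable of (start, end) pairs (hypothetical input
    format). A GC fraction of exactly 1.0 is placed in the last bin.
    """
    counts = Counter()
    for start, end in fragments:
        gc = fragment_gc(transcript, start, end)
        bin_index = min(int(gc * n_bins), n_bins - 1)
        counts[bin_index] += 1
    return [counts[i] for i in range(n_bins)]
```

Comparing such per-bin counts between samples, for fragments drawn from the same transcripts, gives a rough picture of the sample-specific bias the abstract describes. A bias-aware abundance estimator would go further and model this dependence rather than only tabulate it.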

Details

Title
Modeling of RNA-seq fragment sequence bias reduces systematic errors in transcript abundance estimation
Author
Love, Michael I; Hogenesch, John B; Irizarry, Rafael A
Pages
1287-1291
Publication year
2016
Publication date
Dec 2016
Publisher
Nature Publishing Group
ISSN
1087-0156
e-ISSN
1546-1696
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
1846712667
Copyright
Copyright Nature Publishing Group Dec 2016