Abstract

Despite the relative maturity of bulk RNA-sequencing compared to more recent developments such as single-cell and spatial RNA-sequencing, biases that affect data analysis continue to surface. One such bias, termed "Differential Allelic Representation" (DAR), is particularly evident when experimental samples are taken from non-isogenic genetic backgrounds. DAR is an uneven distribution of polymorphic loci between groups of experimental samples undergoing differential gene expression (DGE) analysis. When unequally represented polymorphic loci are also expression quantitative trait loci (eQTLs), DAR can lead to differences in gene expression that are not directly relevant to the primary research objectives. To mitigate DAR in both new and existing datasets, we introduce tadar, a Bioconductor package designed to facilitate transcriptome analysis by accounting for differential allelic representation. tadar implements a methodology that calculates a DAR metric at each polymorphic locus across the genome, which then serves as a predictive measure of a locus' potential to cause eQTL-driven expression differences. This metric is then used to reduce eQTL noise in bulk RNA-sequencing data.
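
To make the idea of a per-locus DAR metric concrete, the following Python sketch assumes that DAR at a single locus is the Euclidean distance between the allele-proportion vectors of the two sample groups, divided by the square root of 2 so that values fall between 0 and 1. That formula, the function names, and the allele-count layout are assumptions made here for illustration; the abstract does not specify them, and the sketch does not reproduce tadar's R interface.

```python
import math

# Assumption: alleles at a locus are summarised as counts of the four nucleotides.
ALLELES = ("A", "C", "G", "T")


def allele_proportions(counts):
    """Convert per-allele counts for one sample group into proportions."""
    total = sum(counts.get(allele, 0) for allele in ALLELES)
    if total == 0:
        raise ValueError("no alleles observed at this locus")
    return [counts.get(allele, 0) / total for allele in ALLELES]


def dar(group_a_counts, group_b_counts):
    """Hypothetical DAR score at one locus.

    Assumed definition: Euclidean distance between the two groups'
    allele-proportion vectors, scaled by sqrt(2) to lie in [0, 1].
    """
    p = allele_proportions(group_a_counts)
    q = allele_proportions(group_b_counts)
    distance = math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))
    return distance / math.sqrt(2)


# Toy counts for a single locus (illustrative only, not real data).
mutant_group = {"A": 4, "G": 4}
wild_type_group = {"A": 8}
locus_score = dar(mutant_group, wild_type_group)
```

Under this assumed definition, a score of 0 means both groups carry the same alleles in the same proportions, and a score of 1 means they share no alleles at the locus. Loci with high scores would be the ones most likely to contribute eQTL-driven expression differences between the groups.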

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

* https://doi.org/10.5281/zenodo.14769085

Details

Title
tadar: an R/Bioconductor package to reduce eQTL noise in differential expression analysis
Author
Baer, Lachlan; Lardelli, Michael M; Pederson, Stevie M
University/institution
Cold Spring Harbor Laboratory Press
Section
New Results
Publication year
2025
Publication date
Feb 13, 2025
Publisher
Cold Spring Harbor Laboratory Press
ISSN
2692-8205
Source type
Working Paper
Language of publication
English
ProQuest document ID
3166352295
Copyright
© 2025. This article is published under http://creativecommons.org/licenses/by/4.0/ (“the License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.