Abstract

Quality control is a fundamental but often neglected step in any NGS pipeline. Detecting issues such as cross-sample contamination and sample swaps is essential to ensure data integrity. Here, we present NGSTroubleFinder, a novel Python tool that detects cross-sample contamination, sample swaps, and mismatches between the reported and the inferred genetic and transcriptomic sexes in human Whole-Genome and Whole-Transcriptome Sequencing data. NGSTroubleFinder is implemented in Python and incorporates a custom-built parallelized pileup engine written in C. The tool reports extensive information on the samples in both textual and HTML formats, including key plots for easy interpretation of the results.

Availability and Implementation

NGSTroubleFinder is written in Python and C and can be installed with pip. The source code and the models are freely available on GitHub (https://github.com/STALICLA-RnD/NGSTroubleFinder), and a containerized version is available on Docker Hub (https://hub.docker.com/r/staliclarnd/ngstroublefinder).

Competing Interest Statement

The authors are employees of STALICLA DDS.

Footnotes

* https://github.com/STALICLA-RnD/NGSTroubleFinder

Details

Title
NGSTroubleFinder: A tool for detection and quantification of contamination and kinship across human NGS data
Publication title
bioRxiv; Cold Spring Harbor
Publication year
2025
Publication date
Feb 5, 2025
Section
New Results
Publisher
Cold Spring Harbor Laboratory Press
Source
BioRxiv
Place of publication
Cold Spring Harbor
Country of publication
United States
University/institution
Cold Spring Harbor Laboratory Press
ISSN
2692-8205
Source type
Working Paper
Language of publication
English
Document type
Working Paper
ProQuest document ID
3163596310
Document URL
https://www.proquest.com/working-papers/ngstroublefinder-tool-detection-quantification/docview/3163596310/se-2?accountid=208611
Copyright
© 2025. This article is published under http://creativecommons.org/licenses/by/4.0/ (“the License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-02-06
Database
ProQuest One Academic