Abstract

Background

Transcriptome sequencing (RNA-seq) is a powerful technology for gene expression profiling. Selection of optimal parameters for cDNA library generation is crucial for acquisition of high-quality data. In this study, we investigate the impact of the amount of RNA and the number of PCR cycles used for sample amplification on the rate of PCR duplication and, in consequence, on the RNA-seq data quality.

Results

For broader applicability, we sequenced the libraries on four short-read sequencing platforms: Illumina NovaSeq 6000, Illumina NovaSeq X, Element Biosciences AVITI, and Singular Genomics G4. The native Illumina libraries were converted for sequencing on AVITI and G4 to assess the effect of library conversion, which involves additional PCR cycles. We find that the rate of PCR duplicates depends on the combined effect of the amount of RNA input material and the number of PCR cycles used for amplification. For input amounts lower than 125 ng, 34–96% of reads were discarded via deduplication, with the percentage increasing with lower input amount and decreasing with increasing PCR cycles. The reduced read diversity for low input amounts leads to fewer genes detected and increased noise in expression counts.
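
As an illustration of the quantity reported above (the percentage of reads discarded via deduplication), the following minimal sketch computes that fraction for a small set of reads. It is not the authors' pipeline. The function name `duplicate_fraction` and the read key (chromosome, position, strand, molecular barcode) are assumptions made for illustration only, and the example reads are invented toy values, not data from the study.

```python
def duplicate_fraction(read_keys):
    """Return the fraction of reads discarded when one read is kept per key.

    Each key is a hypothetical tuple (chromosome, position, strand, barcode);
    reads sharing an identical key are treated as PCR duplicates.
    """
    total = len(read_keys)
    if total == 0:
        return 0.0
    unique = len(set(read_keys))  # one representative read per distinct key
    return (total - unique) / total


# Toy example: five reads, three distinct keys (invented values).
reads = [
    ("chr1", 100, "+", "ACGT"),
    ("chr1", 100, "+", "ACGT"),  # duplicate of the first read
    ("chr1", 100, "+", "TTGA"),  # same position, different barcode: kept
    ("chr2", 500, "-", "ACGT"),
    ("chr2", 500, "-", "ACGT"),  # duplicate of the fourth read
]

print(f"{duplicate_fraction(reads):.0%} of reads discarded")
```

Under this definition, a library with low read diversity (few distinct keys relative to sequenced reads) yields a fraction close to 1, which corresponds to the upper end of the reported range.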

Conclusions

Data generated with each of the four sequencing platforms show similar associations of starting material amount and the number of PCR cycles with PCR duplicates, a similar number of detected genes, and comparable gene expression profiles.

Details

Title
The impact of PCR duplication on RNAseq data generated using NovaSeq 6000, NovaSeq X, AVITI, and G4 sequencers
Author
Zajac, Natalia; Vlachos, Ioannis S; Sajibu, Sija; Opitz, Lennart; Wang, Shuoshuo; Chittur, Sridar V; Mason, Christopher E; Knudtson, Kevin L; Ashton, John M; Rehrauer, Hubert; Aquino, Catharine
Pages
1-17
Section
Research
Publication year
2025
Publication date
2025
Publisher
BioMed Central
ISSN
14747596
e-ISSN
1474760X
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3216564484
Copyright
© 2025. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.