Abstract

RNA abundance quantification has become routine and affordable thanks to high-throughput short-read technologies that provide accurate molecule counts at the gene level. Similarly accurate and affordable quantification of definitive full-length transcript isoforms has remained a stubborn challenge, despite its obvious biological significance across a wide range of problems. Long-read sequencing platforms now produce data types that can, in principle, drive routine definitive isoform quantification. However, some particulars of contemporary long-read data types, together with isoform complexity and genetic variation, present bioinformatic challenges. We show here, using Oxford Nanopore Technologies (ONT) data, that fast and accurate quantification of long-read data is possible and that it is improved by exome capture. To perform quantifications, we developed lr-kallisto, which adapts the kallisto bulk and single-cell RNA-seq quantification methods for long-read technologies.

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

* We added Table 2, which lists the IGVF accessions for the processed data in Figure 1, covering both lr-kallisto's pseudobulk and single-cell processing. In revising the single-cell processing pipeline, we mistakenly omitted the addition of an author in our previous revision. We are therefore now adding A Sina Booeshaghi as an author on this manuscript.

* https://github.com/pachterlab/LSRRSRLFKOTWMWMP_2024

* https://registry.opendata.aws/sg-nex-data

* https://zenodo.org/records/11201284

* https://zenodo.org/records/13733737

Details

Title
Long-read sequencing transcriptome quantification with lr-kallisto
Author
Loving, Rebekah K; Sullivan, Delaney K; Booeshaghi, A Sina; Reese, Fairlie; Rebboah, Elisabeth; Sakr, Jasmine; Rezaie, Narges; Liang, Heidi Y; Filimban, Ghassan; Kawauchi, Shimako; Oakes, Conrad; Trout, Diane; Williams, Brian A; Macgregor, Grant; Wold, Barbara; Mortazavi, Ali; Pachter, Lior
University/institution
Cold Spring Harbor Laboratory Press
Section
New Results
Publication year
2025
Publication date
Jan 29, 2025
Publisher
Cold Spring Harbor Laboratory Press
Source type
Working Paper
Language of publication
English
ProQuest document ID
3145260319
Copyright
© 2025. This article is published under http://creativecommons.org/licenses/by/4.0/ (“the License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.