Abstract

RNA abundance quantification has become routine and affordable thanks to high-throughput short-read technologies that provide accurate molecule counts at the gene level. Similarly accurate and affordable quantification of definitive, full-length transcript isoforms has remained a stubborn challenge, despite its obvious biological significance across a wide range of problems. Long-read sequencing platforms now produce data types that can, in principle, drive routine definitive isoform quantification. However, some particulars of contemporary long-read data types, together with isoform complexity and genetic variation, present bioinformatic challenges. We show here, using Oxford Nanopore Technologies (ONT) data, that fast and accurate quantification of long-read data is possible and that it is improved by exome capture. To perform quantifications, we developed lr-kallisto, which adapts the kallisto bulk and single-cell RNA-seq quantification methods for long-read technologies.

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

* We added Table 2, which lists the IGVF accessions for the processed data in Figure 1, covering both lr-kallisto's pseudobulk and single-cell processing. While revising the single-cell processing pipeline for our previous revision, we mistakenly omitted an author. We are therefore now adding A Sina Booeshaghi as an author on this manuscript.

* https://github.com/pachterlab/LSRRSRLFKOTWMWMP_2024

* https://registry.opendata.aws/sg-nex-data

* https://zenodo.org/records/11201284

* https://zenodo.org/records/13733737

Details

Title
Long-read sequencing transcriptome quantification with lr-kallisto
Author
Loving, Rebekah K; Sullivan, Delaney K; Booeshaghi, A Sina; Reese, Fairlie; Rebboah, Elisabeth; Sakr, Jasmine; Rezaie, Narges; Liang, Heidi Y; Filimban, Ghassan; Kawauchi, Shimako; Oakes, Conrad; Trout, Diane; Williams, Brian A; Macgregor, Grant; Wold, Barbara; Mortazavi, Ali; Pachter, Lior
University/institution
Cold Spring Harbor Laboratory Press
Section
New Results
Publication year
2025
Publication date
Jan 29, 2025
Publisher
Cold Spring Harbor Laboratory Press
ISSN
2692-8205
Source type
Working Paper
Language of publication
English
ProQuest document ID
3145260319
Copyright
© 2025. This article is published under http://creativecommons.org/licenses/by/4.0/ (“the License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.