
Abstract

Background

The advent of Single Molecule Real-Time (SMRT) sequencing has overcome many limitations of second-generation sequencing, such as limited read lengths and PCR amplification biases. However, longer reads increase data volume exponentially, and high error rates make many existing alignment tools inapplicable. Additionally, the performance bottleneck of a single CPU restricts the effectiveness of alignment algorithms for SMRT sequencing.

Results

To address these challenges, we introduce ParaHAT, a parallel alignment algorithm for noisy long reads. ParaHAT exploits vector-level, thread-level, process-level, and heterogeneous parallelism. We redesign the layout of the dynamic programming matrices to eliminate data dependencies in base-level alignment, enabling effective vectorization. We further increase computational speed through heterogeneous parallel computing and implement the algorithm for multi-node computing using MPI, overcoming the computational limits of a single node.
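
The abstract does not specify ParaHAT's matrix layout, so the following is only a minimal sketch of one well-known way to remove the in-row dependency of a dynamic programming alignment: storing the matrix by anti-diagonals. Unit-cost edit distance is used here as a stand-in for the scoring scheme, and the function names are hypothetical. Every cell on one anti-diagonal depends only on the two previous anti-diagonals, so the inner loop has no loop-carried dependency and could in principle be mapped to vector lanes.

```python
def edit_distance_rowwise(a, b):
    """Reference version: row-major DP. Cell (i, j) needs cell (i, j-1)
    from the same row, so the inner loop is inherently sequential."""
    prev = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        cur = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost)
        prev = cur
    return prev[len(b)]


def edit_distance_antidiagonal(a, b):
    """Illustrative layout (an assumption, not ParaHAT's documented design):
    each buffer holds one anti-diagonal d = i + j, indexed by row i."""
    n, m = len(a), len(b)
    prev2 = [0] * (n + 1)  # anti-diagonal d - 2
    prev1 = [0] * (n + 1)  # anti-diagonal d - 1
    for d in range(n + m + 1):
        cur = [0] * (n + 1)
        lo, hi = max(0, d - m), min(n, d)
        # Iterations of this loop read only prev1 and prev2, never cur,
        # so they are mutually independent.
        for i in range(lo, hi + 1):
            j = d - i
            if i == 0:
                cur[i] = j
            elif j == 0:
                cur[i] = i
            else:
                cost = 0 if a[i - 1] == b[j - 1] else 1
                cur[i] = min(prev1[i - 1] + 1,    # cell (i-1, j)
                             prev1[i] + 1,        # cell (i, j-1)
                             prev2[i - 1] + cost)  # cell (i-1, j-1)
        prev2, prev1 = prev1, cur
    return prev1[n]


if __name__ == "__main__":
    print(edit_distance_rowwise("kitten", "sitting"))
    print(edit_distance_antidiagonal("kitten", "sitting"))
```

Both functions compute the same distance; only the traversal order and storage layout differ, which is the property a vectorized implementation would rely on.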

Conclusions

Performance evaluations show that ParaHAT achieves a 10.03x speedup in base-level alignment, with a parallel acceleration ratio of 94.61 and a weak scalability metric of 98.98% on 128 nodes.
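
The abstract does not define these metrics, so the sketch below uses the conventional textbook definitions as an assumption: speedup is single-node time divided by multi-node time for a fixed problem, parallel efficiency is speedup divided by node count, and weak-scaling efficiency is single-node time divided by multi-node time when the problem size grows with the node count. The function names and the timings in the usage lines are hypothetical and are not measurements from the paper.

```python
def speedup(t_one_node, t_n_nodes):
    """Fixed problem size: how many times faster the multi-node run is."""
    return t_one_node / t_n_nodes


def parallel_efficiency(speedup_value, nodes):
    """Fraction of ideal linear speedup actually obtained."""
    return speedup_value / nodes


def weak_scaling_efficiency(t_one_node, t_n_nodes):
    """Problem size grows with node count; the ideal value is 1.0."""
    return t_one_node / t_n_nodes


if __name__ == "__main__":
    # Under the assumed definition, a ratio of 94.61 on 128 nodes
    # corresponds to roughly 74% of ideal linear speedup.
    print(round(parallel_efficiency(94.61, 128), 4))
    # Hypothetical timings (seconds), chosen only to illustrate the formula.
    print(round(weak_scaling_efficiency(100.0, 101.03), 4))
```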

Details

1009240
Title
Fast noisy long read alignment with multi-level parallelism
Publication title
Volume
26
Pages
1-31
Publication year
2025
Publication date
2025
Section
Research
Publisher
Springer Nature B.V.
Place of publication
London
Country of publication
Netherlands
Publication subject
e-ISSN
14712105
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2025-05-02
Milestone dates
2024-10-30 (Received); 2025-04-01 (Accepted); 2025-05-02 (Published)
First posting date
2025-05-02
ProQuest document ID
3201517842
Document URL
https://www.proquest.com/scholarly-journals/fast-noisy-long-read-alignment-with-multi-level/docview/3201517842/se-2?accountid=208611
Copyright
© 2025. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-05-09
Database
ProQuest One Academic