Content area
The growing adoption of supercomputers across various scientific disciplines, particularly by researchers without a background in computer science, has intensified the demand for parallel applications. These applications are typically developed using a combination of programming models within languages such as C, C++, and Fortran. However, modern multi-core processors and accelerators necessitate fine-grained control to achieve effective parallelism, complicating the development process. To address this, developers commonly utilize high-level programming models such as Open Multi-Processing (OpenMP), Open Accelerators (OpenACCs), Message Passing Interface (MPI), and Compute Unified Device Architecture (CUDA). These models may be used independently or combined into dual- or tri-model applications to leverage their complementary strengths. However, integrating multiple models introduces subtle and difficult-to-detect runtime errors such as data races, deadlocks, and livelocks that often elude conventional compilers. This complexity is exacerbated in applications that simultaneously incorporate MPI, OpenMP, and CUDA, where the origin of runtime errors, whether from individual models, user logic, or their interactions, becomes ambiguous. Moreover, existing tools are inadequate for detecting such errors in tri-model applications, leaving a critical gap in development support. To address this gap, the present study introduces a static analysis tool designed specifically for tri-model applications combining MPI, OpenMP, and CUDA in C++-based environments. The tool analyzes source code to identify both actual and potential runtime errors prior to execution. Central to this approach is the introduction of error dependency graphs, a novel mechanism for systematically representing and analyzing error correlations in hybrid applications. By offering both error classification and comprehensive static detection, the proposed tool enhances error visibility and reduces manual testing effort. This contributes significantly to the development of more robust parallel applications for high-performance computing (HPC) and future exascale systems.
Details
; Eassa Fathy Elbouraey 2
; Sharaf, Sanaa Abdullah 2
; Alghamdi, Ahmed Mohammed 3
; Almarhabi, Khalid Ali 4
; Khalid Rana Ahmad Bilal 5
1 Department of Computer Science, Faculty of Computing, and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia, Department of Computer Science and Artificial Intelligence, Umm Al-Qura University, Makkah 21955, Saudi Arabia
2 Department of Computer Science, Faculty of Computing, and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
3 Department of Software Engineering, College of Computer Science and Engineering, University of Jeddah, Jeddah 21493, Saudi Arabia
4 Department of Computer Science, College of Computing at Alqunfudah, Umm Al-Qura University, Makkah 21514, Saudi Arabia
5 College of Engineering and Physical Sciences, Aston University, Aston Triangle, Birmingham B4 7ET, UK