The rise of heterogeneous computing architectures in High-Performance Computing (HPC) has revolutionized computing capabilities but has also introduced significant programming complexity. Among these challenges, efficient data movement and memory management between CPUs and attached accelerators have proven critical for achieving good performance. Existing solutions require significant development effort to attain maximal performance while maintaining portability. This thesis presents two complementary methodologies for addressing these challenges through static and dynamic analysis of heterogeneous OpenMP applications. The first contribution is OMPDart, a static analysis tool that automates the generation of efficient OpenMP data mappings. OMPDart implements interprocedural, context-sensitive static analysis techniques to model data dependencies across host and device memory spaces, and it transforms source code to optimize data transfers. The second contribution employs dynamic analysis to detect and profile inefficient data mapping patterns in OpenMP applications and to estimate their optimization potential. We implement the proposed pattern detection approaches in OMPDataPerf, which provides source code attribution and actionable insights for optimizing inefficient data mappings. These tools and techniques advance the state of the art in automating and optimizing data mappings in heterogeneous applications, reducing programmer burden and enabling more efficient use of modern HPC platforms.
