Conference Title: 2025 34th International Conference on Parallel Architectures and Compilation Techniques (PACT)
Conference Start Date: 2025 Nov. 3
Conference End Date: 2025 Nov. 6
Conference Location: Irvine, CA, USA
Modern high-performance computing (HPC) platforms rely on hybrid combinations of shared-memory and distributed-memory programming models to achieve scalable parallelism. Despite their success, these hybrid approaches remain challenging for many programmers trained primarily in shared-memory models such as OpenMP. For decades, researchers have sought ways to automatically generate distributed-memory parallelism from shared-memory programs to bridge this productivity gap. The Partitioned Global Address Space (PGAS) approach, exemplified by UPC and OpenSHMEM, represents one such effort, but its reliance on Bulk Synchronous Parallel (BSP) execution often results in poor performance for irregular algorithms, such as those found in graph applications. This paper presents a compiler-driven approach that automatically generates high-performance distributed-memory code from a simple PGAS-style extension to OpenMP, referred to as PGAS-OpenMP. The generated code employs an actor-based distributed-memory execution model, which has been shown to significantly outperform BSP implementations for irregular workloads. While BSP-style PGAS code can always be generated in a straightforward manner from PGAS-OpenMP, actor-based translation requires careful analysis to ensure correctness. Our experimental results demonstrate that the proposed compiler safely converts a wide range of PGAS-OpenMP graph applications into asynchronous actor-based code, yielding substantial performance improvements. Unlike prior work on translating shared-memory programs to distributed-memory systems, this is the first approach to automatically target actor-based execution for programs with irregular parallelism. The generated code supports fine-grained asynchronous one-sided messaging with automatic message aggregation, enabling efficient communication and computation overlap.
This research addresses a long-standing challenge in HPC: combining programmer productivity with high performance for irregular distributed-memory applications. It provides a promising pathway toward unifying shared-memory programmability with actor-based distributed-memory execution on current and future HPC systems.
1 Georgia Institute of Technology, College of Computing, Atlanta, Georgia, USA