
Abstract

Modern computer architectures are complex, with a wide range of features that can be leveraged to optimize software performance. However, efficiently optimizing software for these architectures remains a challenging task. Traditional optimization techniques, such as manual optimization by human experts or optimization by compilers, may not fully exploit the unique features of novel architectures, leading to suboptimal performance.

Program synthesis is a promising approach to address this challenge. At its core, program synthesis searches for programs that meet a specified set of requirements. Recent advances in SMT (Satisfiability Modulo Theories) solvers and increased computation power have made program synthesis a viable choice for code generation. Program synthesis can generate code specifically tailored to the target architecture, leveraging domain-specific knowledge and advanced search techniques to create highly optimized code.

The overall goal of this dissertation is to develop program synthesizers that efficiently optimize software for emerging architectures. Its thesis is that program synthesis can generate highly optimized code for novel architectures, outperforming traditional optimization techniques. To support this thesis, the dissertation introduces program synthesizers that generate optimized code for a variety of hardware platforms. The effectiveness of these synthesizers is evaluated by comparing the generated code with manually optimized code and with code generated by traditional compilers. The results show that program synthesis can produce code that is faster and more efficient than code produced by traditional optimization techniques.

This dissertation presents Minotaur, a superoptimizer that uses program synthesis to optimize LLVM IR (Low Level Virtual Machine Intermediate Representation) code. Minotaur extracts program slices from LLVM IR code, and uses an SMT solver to find optimized versions of these slices. Minotaur is designed to work within the LLVM optimization pipeline, and can be used to discover new optimizations that are missed by commodity compilers.
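
The extract-then-search idea described above can be illustrated with a minimal sketch. It is not Minotaur's implementation: the 8-bit width, the three-instruction candidate set, the cost numbers, and the names `target`, `candidates`, and `superoptimize` are all assumptions made for illustration, and an exhaustive check over every 8-bit input stands in for the SMT solver query.

```python
WIDTH = 8                      # assumed bit width, small enough to check exhaustively
MASK = (1 << WIDTH) - 1

def target(x):
    # Hypothetical "slice" to optimize: (x * 2) + (x * 2), wrapping at 8 bits.
    return ((x * 2) + (x * 2)) & MASK

# Hypothetical instruction set: name -> (semantics, assumed cost).
OPS = {
    "add": (lambda a, b: (a + b) & MASK, 1),
    "mul": (lambda a, b: (a * b) & MASK, 3),
    "shl": (lambda a, b: (a << (b % WIDTH)) & MASK, 1),
}
CONSTS = [1, 2, 3, 4]

def candidates():
    """Yield (cost, text, function) for every one-instruction program on x."""
    for name, (fn, cost) in OPS.items():
        for c in CONSTS:
            yield cost, f"{name} x, {c}", (lambda x, fn=fn, c=c: fn(x, c))

def equivalent(f, g):
    # Brute-force stand-in for a solver's equivalence proof.
    return all(f(x) == g(x) for x in range(1 << WIDTH))

def superoptimize():
    """Return the cheapest candidate that matches the target on all inputs."""
    for cost, text, fn in sorted(candidates(), key=lambda t: (t[0], t[1])):
        if equivalent(fn, target):
            return cost, text
    return None

print(superoptimize())
```

Trying candidates in order of increasing cost means the first equivalent program found is also a cheapest one within this tiny search space. A real superoptimizer would replace the exhaustive check with a solver query so that it scales to wider integers and vector types.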

This dissertation introduces SCCL (Synthesized Collective Communication Library), a program synthesizer that optimizes collective algorithms for parallel computation. SCCL uses domain-specific knowledge about collective algorithms to generate highly optimized code for specific architectures. SCCL is designed to be a drop-in replacement for existing collective communication libraries, such as NVIDIA NCCL (NVIDIA Collective Communications Library) and AMD RCCL (ROCm Collective Communications Library). SCCL synthesizes novel latency- and bandwidth-optimal algorithms not seen in the literature on two popular hardware topologies. This dissertation also shows how SCCL efficiently lowers algorithms to implementations on two GPU (Graphics Processing Unit) hardware architectures and demonstrates competitive performance with hand-optimized collective communication libraries.
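
For readers unfamiliar with collective algorithms, the sketch below simulates a textbook ring allgather, the kind of schedule a collective synthesizer searches over. It is not an algorithm produced by SCCL; the function name `ring_allgather` and the one-chunk-per-node, one-send-per-step model are assumptions chosen to keep the example small.

```python
def ring_allgather(num_nodes):
    """Simulate a ring allgather; return (steps taken, chunks held per node)."""
    # Node i starts with only chunk i.
    have = [{i} for i in range(num_nodes)]
    # Chunk each node will forward to its right-hand neighbor next.
    to_send = list(range(num_nodes))
    steps = 0
    while any(len(h) < num_nodes for h in have):
        # All sends in a step happen at once: node i receives from node i - 1.
        incoming = [to_send[(i - 1) % num_nodes] for i in range(num_nodes)]
        for i, chunk in enumerate(incoming):
            have[i].add(chunk)
        # Each node forwards the chunk it just received in the next step.
        to_send = incoming
        steps += 1
    return steps, have

steps, have = ring_allgather(4)
print(steps, have)
```

In this model the ring finishes in `num_nodes - 1` steps with every link busy in every step. A synthesizer explores alternative schedules of this kind on a given topology, trading the number of steps (latency) against the data sent per link (bandwidth).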

Details

1010268
Title
Program Synthesis for Performance Optimizations
Number of pages
98
Publication year
2025
Degree date
2025
School code
0240
Source
DAI-B 87/3(E), Dissertation Abstracts International
ISBN
9798293837922
Advisor
Committee member
Gopalakrishnan, Ganesh; Hall, Mary W.; Rakamaric, Zvonimir; Sampson, Adrian
University/institution
The University of Utah
Department
School of Computing
University location
United States -- Utah
Degree
Ph.D.
Source type
Dissertation or Thesis
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
31931656
ProQuest document ID
3250316310
Document URL
https://www.proquest.com/dissertations-theses/program-synthesis-performance-optimizations/docview/3250316310/se-2?accountid=208611
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database
ProQuest One Academic