Synergizing High Performance Computing and Machine Learning for Co-Designing Computing Systems

Abstract

The rapid evolution of computing demands from Machine Learning (ML) and High Performance Computing (HPC) applications has exposed the limitations of general-purpose architectures, necessitating a shift towards domain-specific computing. This surge in demand is driven by the need to train massive ML models, process large-scale data in real time, and execute highly parallel computations, which general-purpose architectures struggle to handle efficiently. The transition to domain-specific computing, however, presents significant challenges in system design, particularly in terms of complexity, cost, and human effort. System design encompasses the design and optimization of the entire computing stack: computer architecture, architecture-specific compilers, program optimization, and hardware/software co-design that balances constraints such as latency, throughput, and energy consumption. Designing an efficient computing system is therefore challenging, but there are also opportunities to improve the system design process itself. This thesis presents a two-part body of work on designing computing systems for domain-specific applications: system design for HPC applications, and ML and HPC for system design.

Towards the first thrust, this thesis introduces C-SAW, a GPU-accelerated HPC system for efficient graph sampling and random walks. C-SAW is the first framework to support a diverse set of mainstream and emerging graph sampling algorithms on GPUs. It provides a MapReduce-style, bias-centric programming interface that generalizes across these algorithms. Towards the second thrust, this thesis lays the groundwork for applying ML to system microarchitecture by introducing an ML-based framework for microarchitecture performance modeling and analysis. Specifically, it introduces ML techniques that accurately model the performance of a microarchitecture, builds an HPC framework that makes ML-based microarchitecture simulation efficient, and redesigns ML-based simulation to make it more reusable and adaptable.
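As a rough illustration of what "bias-centric" sampling can mean, the sketch below runs a random walk in which a user-supplied bias function assigns a weight to each candidate edge, and the next vertex is chosen with probability proportional to that weight. This is a minimal CPU-side sketch in Python, not C-SAW's actual interface or GPU implementation: the names `select_neighbor`, `random_walk`, and `edge_bias`, the adjacency-list graph layout, and the use of prefix sums with binary search for weighted selection are all assumptions made for illustration.

```python
import bisect
import itertools
import random


def select_neighbor(graph, vertex, edge_bias, rng):
    """Pick one neighbor of `vertex` with probability proportional to its bias.

    `graph` is a dict mapping a vertex to a list of neighbors (hypothetical layout).
    `edge_bias(u, v)` is a user-defined function returning a non-negative weight.
    Returns None when the vertex has no neighbor with positive bias.
    """
    neighbors = graph.get(vertex, [])
    if not neighbors:
        return None
    # Per-edge step: evaluate the user-defined bias for every candidate edge.
    biases = [edge_bias(vertex, n) for n in neighbors]
    # Aggregate step: running totals of the biases (prefix sums).
    prefix = list(itertools.accumulate(biases))
    total = prefix[-1]
    if total <= 0:
        return None
    # Draw a point in [0, total) and locate the interval it falls into.
    r = rng.random() * total
    idx = bisect.bisect_right(prefix, r)
    # Clamp guards against floating-point rounding at the upper edge.
    return neighbors[min(idx, len(neighbors) - 1)]


def random_walk(graph, start, length, edge_bias, seed=0):
    """Walk up to `length` steps from `start`, stopping early at dead ends."""
    rng = random.Random(seed)
    walk = [start]
    for _ in range(length):
        nxt = select_neighbor(graph, walk[-1], edge_bias, rng)
        if nxt is None:
            break
        walk.append(nxt)
    return walk


if __name__ == "__main__":
    toy_graph = {0: [1, 2], 1: [2], 2: [0], 3: []}
    # A uniform bias gives an unbiased random walk.
    print(random_walk(toy_graph, 0, 5, lambda u, v: 1.0))
    # A degree-based bias prefers neighbors that have more outgoing edges.
    print(random_walk(toy_graph, 0, 5, lambda u, v: float(len(toy_graph[v]))))
```

Swapping the bias function changes the sampling algorithm without touching the selection logic, which is the kind of generality a bias-centric interface aims for.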

By building a GPU-accelerated HPC system optimized for graph applications and creating an ML-driven tool for microarchitecture performance analysis, this thesis contributes both to high-efficiency computing systems and to improved system evaluation methodologies.

Details

Subject

Computer engineering;
Computer science;
Design

Classification

0464: Computer Engineering
0984: Computer science
0389: Design

Identifier / keyword

Computer systems; High Performance Computing; Machine Learning; Microarchitecture; Computer architecture

Title

Synergizing High Performance Computing and Machine Learning for Co-Designing Computing Systems

Author

Pandey, Santosh

Number of pages

130

Publication year

2025

Degree date

2025

School code

0190

Source

DAI-A 87/1(E), Dissertation Abstracts International

ISBN

9798290614465

Advisor

Liu, Hang

Committee member

Zhang, Zhao; Yazdanbakhsh, Amir; Chen, Yingying

University/institution

Rutgers The State University of New Jersey, School of Graduate Studies

Department

Electrical and Computer Engineering

University location

United States -- New Jersey

Degree

Ph.D.

Source type

Dissertation or Thesis

Language

English

Document type

Dissertation/Thesis

Dissertation/thesis number

32166807

ProQuest document ID

3234535465

Document URL

https://www.proquest.com/dissertations-theses/synergizing-high-performance-computing-machine/docview/3234535465/se-2?accountid=208611

Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.

Database

ProQuest One Academic
