Abstract

The rapid evolution of computing demands from Machine Learning (ML) and High Performance Computing (HPC) applications has exposed the limitations of general-purpose architectures, necessitating a shift towards domain-specific computing. This surge in demand is driven by the need to train massive ML models, process large-scale data in real time, and execute highly parallelized computations, which general-purpose architectures struggle to handle efficiently. This transition, however, presents significant challenges in system design, particularly in terms of complexity, cost, and human effort. System design encompasses the design and optimization of the computing stack, ranging from computer architecture and architecture-specific compilers to program optimization and hardware-software co-design that balances constraints such as latency, throughput, and energy consumption. On the one hand, designing an efficient computing system is challenging. On the other hand, there are also opportunities to improve the system design itself. This thesis presents a two-part body of work towards designing computing systems for domain-specific applications: system design for HPC applications, and ML/HPC for system design.

Towards the first thrust, this thesis introduces C-SAW, a GPU-accelerated HPC system for efficient graph sampling and random walks. C-SAW is the first framework to support a diverse set of mainstream and emerging graph sampling algorithms on GPUs. C-SAW devises a MapReduce-style, bias-centric programming interface that generalizes to diverse algorithms. Towards the second thrust, this thesis lays the groundwork for applying ML in the system microarchitecture by introducing an ML-based microarchitecture performance modeling and performance analysis framework. This thesis introduces ML techniques to accurately model the performance of a microarchitecture, builds an HPC framework for making ML-based microarchitecture simulation efficient, and redesigns ML-based simulation to make it more reusable and adaptable.

By building a GPU-accelerated HPC system optimized for graph applications and creating an ML-driven tool for microarchitecture performance analysis, this thesis contributes both to high-efficiency computing systems and to improved system evaluation methodologies.

Details

Title
Synergizing High Performance Computing and Machine Learning for Co-Designing Computing Systems
Number of pages
130
Publication year
2025
Degree date
2025
School code
0190
Source
DAI-A 87/1(E), Dissertation Abstracts International
ISBN
9798290614465
Advisor
Committee member
Zhang, Zhao; Yazdanbakhsh, Amir; Chen, Yingying
University/institution
Rutgers The State University of New Jersey, School of Graduate Studies
Department
Electrical and Computer Engineering
University location
United States -- New Jersey
Degree
Ph.D.
Source type
Dissertation or Thesis
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
32166807
ProQuest document ID
3234535465
Document URL
https://www.proquest.com/dissertations-theses/synergizing-high-performance-computing-machine/docview/3234535465/se-2?accountid=208611
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database
ProQuest One Academic