Content area

Abstract

This dissertation explores the problem of analyzing and optimizing data-dependent programs from a theoretical and practical perspective. The performance of these programs depends in a complex manner on the distribution of the input data, and they arise in many contexts, e.g. databases, sparse tensor programming, and graph analytics. By definition, these programs cannot be optimized by considering the code alone, so an optimizer for them must consider information about the data distribution. This dissertation presents two new theoretical approaches for analyzing data-dependent programs by bounding the size of their intermediate results: the degree sequence bound and partition constraints. It then describes two practical systems for producing these bounds: SafeBound and COLOR. Lastly, we present a state-of-the-art optimizer for sparse tensor programs, Galley, that demonstrates the value of data-aware optimization.

Details

1010268
Title
Data-Aware Complexity Analysis and Program Optimization
Number of pages
277
Publication year
2025
Degree date
2025
School code
0250
Source
DAI-B 87/3(E), Dissertation Abstracts International
ISBN
9798293850617
Committee member
Bernstein, Gilbert
University/institution
University of Washington
Department
Computer Science and Engineering
University location
United States -- Washington
Degree
Ph.D.
Source type
Dissertation or Thesis
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
32238123
ProQuest document ID
3251632233
Document URL
https://www.proquest.com/dissertations-theses/data-aware-complexity-analysis-program/docview/3251632233/se-2?accountid=208611
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database
ProQuest One Academic