Abstract

Integrating artificial intelligence (AI) with database (DB) systems unlocks transformative opportunities to overcome the limitations of both fields: the lack of intelligence and automation for data management, and the lack of declarative and unified storage and computation optimization for end-to-end data-driven AI workflows. This dissertation explores critical challenges toward a fully integrated framework for both "AI4DB" and "DB4AI" with a focus on the following research problems: (1) How to leverage neural network models to compactly and accurately represent data to speed up database queries in resource-constrained environments? (2) How to manage neural network models as relations in database storage systems to enable efficient serving of AI models from database systems? (3) How to co-optimize AI computations and relational computations to enable efficient execution of inference queries that nest AI model inferences and Structured Query Language (SQL) processing?

To bridge these gaps, this work proposes a novel architecture for next-generation database systems that fully integrates AI and DB, with three core contributions. First, it introduces DeepMapping, a learning-based hybrid data representation for lossless data compression and efficient key-value lookup. DeepMapping couples a neural network model that learns the key-value mapping with an auxiliary data structure that corrects model errors to provide 100% accuracy and caches insertions, deletions, and updates to avoid frequent retraining while supporting data modifications. Second, it presents a novel database storage system that unifies the management of relational data and neural network models to support in-database model serving and avoid data transfer overheads. It leverages an accuracy-aware similar tensor block deduplication algorithm, a greedy packing algorithm for tensor blocks, and an enhanced caching mechanism to reduce memory and storage costs while maintaining low-latency inference for serving multiple models. Lastly, CactusDB, a novel database query compilation system, co-optimizes SQL and AI/ML inference using a flexible multi-level intermediate representation and a learned query optimizer. By integrating deep learning into database systems and enabling co-optimization, this research advances the design of next-generation database systems capable of supporting complex AI-based analytics with significant acceleration over the state of the art.
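
As a rough illustration of the hybrid idea behind DeepMapping described above (a predictive model paired with an auxiliary structure that corrects its errors and absorbs modifications), the following minimal sketch may help. It is not the dissertation's implementation: the class `HybridMapping`, its methods, and the use of a plain Python function in place of a trained neural network are all assumptions made for illustration, and the explicit key set is a simplification that a real system would presumably replace with a more compact existence structure.

```python
from typing import Callable, Dict, Optional, Set


class HybridMapping:
    """Toy key-value store: a model plus a correction table (hypothetical)."""

    def __init__(self, model: Callable[[int], int], data: Dict[int, int]) -> None:
        # `model` stands in for a trained neural network (assumption).
        self.model = model
        # Track which keys exist so lookups of absent keys return None.
        self.keys: Set[int] = set(data)
        # Store only the entries the model gets wrong; together with the
        # model, this reproduces every original value exactly.
        self.corrections: Dict[int, int] = {
            k: v for k, v in data.items() if model(k) != v
        }

    def get(self, key: int) -> Optional[int]:
        if key not in self.keys:
            return None
        if key in self.corrections:
            return self.corrections[key]
        return self.model(key)

    def put(self, key: int, value: int) -> None:
        # Inserts and updates go to the auxiliary table; no retraining here.
        self.keys.add(key)
        if self.model(key) == value:
            self.corrections.pop(key, None)
        else:
            self.corrections[key] = value

    def delete(self, key: int) -> None:
        self.keys.discard(key)
        self.corrections.pop(key, None)


# Usage: the "model" captures the dominant pattern (value = key mod 3),
# and a single exception is kept in the correction table.
data = {k: k % 3 for k in range(100)}
data[7] = 2  # one entry the model mispredicts
store = HybridMapping(lambda k: k % 3, data)
print(len(store.corrections), store.get(7), store.get(8), store.get(1000))
```

In this sketch the correction table grows only with the number of model errors and later modifications, which is the property that would make such a representation compact when the model fits the data well.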

Details

1010268
Business indexing term
Title
Toward Next-Generation Database System: Integrating Data Management With Artificial Intelligence
Author
Number of pages
165
Publication year
2025
Degree date
2025
School code
0010
Source
DAI-B 87/6(E), Dissertation Abstracts International
ISBN
9798265480286
Advisor
Committee member
Baral, Chitta; Candan, Kasim Selcuk; Zhao, Ming
University/institution
Arizona State University
Department
Computer Engineering
University location
United States -- Arizona
Degree
Ph.D.
Source type
Dissertation or Thesis
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
32282985
ProQuest document ID
3281171285
Document URL
https://www.proquest.com/dissertations-theses/toward-next-generation-database-system/docview/3281171285/se-2?accountid=208611
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database
ProQuest One Academic