Abstract

In an era of rapidly growing data, efficient and intelligent relational data management is essential for generating actionable insights and automating decision-making. A key factor driving advancements in this domain is the use of data semantics, which captures the deeper meaning and context of data, extending beyond traditional heuristic and syntactic approaches. By leveraging data semantics, we can enhance insight generation, data integration, and other essential relational data management tasks.

This dissertation explores how advanced data semantics can address several key challenges in relational data management. First, we investigate methods to capture user-defined semantics for assessing the interestingness of data insights, moving beyond traditional developer-defined measures of interestingness. Second, we leverage the enhanced natural language understanding capabilities of large language models (LLMs) to generate fine-grained column semantics for relational data and introduce the concept of “aggregate-related table search”, which captures table semantics across varying aggregation levels. Finally, we propose a self-training framework for LLM fine-tuning on table-related tasks, incorporating table task semantics by generating and validating training data to improve model performance on tasks such as natural-language-to-SQL translation and schema matching.

Through these contributions, this dissertation aims to advance relational data management by embedding a deeper understanding of different aspects of data semantics into various data applications, including data analysis and data discovery systems, ultimately improving the performance of relational data management tasks.

Details

Title
Leveraging Data Semantics for Relational Data Management Tasks
Number of pages
121
Publication year
2025
Degree date
2025
School code
0127
Source
DAI-B 86/11(E), Dissertation Abstracts International
ISBN
9798314874769
Committee member
Hemphill, Libby; Mozafari, Barzan; Wang, Xinyu
University/institution
University of Michigan
Department
Computer Science & Engineering
University location
United States -- Michigan
Degree
Ph.D.
Source type
Dissertation or Thesis
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
32092611
ProQuest document ID
3202727682
Document URL
https://www.proquest.com/dissertations-theses/leveraging-data-semantics-relational-management/docview/3202727682/se-2?accountid=208611
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database
ProQuest One Academic