Content area

Abstract

As database query processing techniques are being used to handle diverse workloads, a key emerging challenge is how to efficiently handle multi-way join queries containing multiple many-to-many joins. While uncommon in traditional enterprise settings that have been the focus of much of the query optimization work to date, such queries are seen frequently in other contexts such as graph workloads. This has led to much work on developing join algorithms for handling cyclic queries, on compressed (factorized) representations for more efficient storage of intermediate results, and on use of semi-joins or predicate transfer to avoid generating large redundant intermediate results. In this paper, we address a core query optimization problem in this context. Specifically, we introduce an improved cost model that more accurately captures the cost of a query plan in such scenarios, and we present several optimization algorithms for query optimization that incorporate these new cost functions. We present an extensive experimental evaluation, that compares the factorized representation approach with a full semi-join reduction approach as well as to an approach that uses bitvectors to eliminate tuples early through sideways information passing. We also present new analyses of robustness of these techniques to the choice of the join order, potentially eliminating the need for more complex query optimization and selectivity estimation techniques.

Details

1009240
Business indexing term
Identifier / keyword
Title
Optimizing Queries with Many-to-Many Joins
Publication title
arXiv.org; Ithaca
Publication year
2024
Publication date
Dec 20, 2024
Section
Computer Science
Publisher
Cornell University Library, arXiv.org
Source
arXiv.org
Place of publication
Ithaca
Country of publication
United States
University/institution
Cornell University Library arXiv.org
e-ISSN
2331-8422
Source type
Working Paper
Language of publication
English
Document type
Working Paper
Publication history
 
 
Online publication date
2024-12-24
Milestone dates
2024-12-20 (Submission v1)
Publication history
 
 
   First posting date
24 Dec 2024
ProQuest document ID
3148947893
Document URL
https://www.proquest.com/working-papers/optimizing-queries-with-many-joins/docview/3148947893/se-2?accountid=208611
Full text outside of ProQuest
Copyright
© 2024. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2024-12-25
Database
ProQuest One Academic