
Abstract

The search for joinable data is pivotal for numerous applications, such as data integration, data augmentation, and data analysis. Although joinable search has been studied successfully for table discovery, finding joinable spatial datasets for a given query across multiple spatial data sources has received little attention. This paper studies two joinable search problems over multiple spatial data sources. In addition to the overlap joinable search problem (OJSP), we propose a novel coverage joinable search problem (CJSP), motivated by many real-world applications in spatial search. To support both problems seamlessly, we propose a multi-source spatial dataset search framework. First, we design a DIstributed Tree-based Spatial index structure called DITS, which both enables acceleration strategies that speed up joinable searches and supports efficient communication between multiple data sources. Additionally, we prove that the CJSP is NP-hard and design a greedy approximation algorithm to solve it. We evaluate the efficiency of our search framework on five real-world data sources, and the experimental results show that it significantly reduces running time and communication costs compared with baselines.

Details

1009240
Identifier / keyword
Title
Joinable Search over Multi-source Spatial Datasets: Overlap, Coverage, and Efficiency
Publication title
arXiv.org; Ithaca
Publication year
2024
Publication date
Dec 10, 2024
Section
Computer Science
Publisher
Cornell University Library, arXiv.org
Source
arXiv.org
Place of publication
Ithaca
Country of publication
United States
University/institution
Cornell University Library arXiv.org
e-ISSN
2331-8422
Source type
Working Paper
Language of publication
English
Document type
Working Paper
Publication history
 
 
Online publication date
2024-12-11
Milestone dates
2023-11-22 (Submission v1); 2024-11-11 (Submission v2); 2024-12-05 (Submission v3); 2024-12-10 (Submission v4)
   First posting date
11 Dec 2024
ProQuest document ID
2892789226
Document URL
https://www.proquest.com/working-papers/joinable-search-over-multi-source-spatial/docview/2892789226/se-2?accountid=208611
Copyright
© 2024. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2024-12-12
Database
ProQuest One Academic