Abstract

Graph query, pattern mining, and knowledge discovery become challenging on large-scale heterogeneous information networks (HINs). State-of-the-art techniques based on path propagation mainly focus on inferring node labels and neighborhood structures. However, entity links in the real world also contain rich hierarchical inheritance relations. For example, a vulnerability in one product version is likely to be inherited from older versions. Exploiting these hierarchical inheritance relations can potentially improve the quality of query results. Motivated by this, we explore hierarchical inheritance relations between entities and formulate the problem of graph query on HINs with hierarchical inheritance relations. We propose a graph query search algorithm that decomposes the original query graph into multiple star queries and applies a star query algorithm to each. The candidates returned by each star query are then combined to construct the final top-k answer to the original query. To obtain the graph query result efficiently on a large-scale HIN, we design a bound-based pruning technique that uses uniform cost search to prune the search space. We implement our algorithm in Spark GraphX and evaluate its effectiveness and efficiency on synthetic and real-world datasets. Compared with two state-of-the-art graph query algorithms, our algorithm obtains more accurate results with competitive performance.
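
To make the decomposition step concrete, here is a minimal sketch of one way a query graph could be split into star queries. It is not the authors' algorithm: the greedy highest-degree heuristic, the function name decompose_into_stars, and the example entity names are all assumptions chosen for illustration, and the query graph is assumed to be undirected with no self-loops.

```python
from collections import defaultdict


def decompose_into_stars(edges):
    """Greedily split an undirected query graph into edge-disjoint stars.

    Illustrative heuristic only (an assumption, not the paper's method):
    repeatedly pick the node touching the most uncovered edges as a star
    center and assign all of its uncovered edges to that star.
    Returns a list of (center, sorted_leaves) pairs covering each edge once.
    """
    remaining = {frozenset(e) for e in edges}
    stars = []
    while remaining:
        # Count how many uncovered edges touch each node.
        degree = defaultdict(int)
        for edge in remaining:
            for node in edge:
                degree[node] += 1
        # Highest uncovered degree wins; ties go to the smallest name.
        center = max(sorted(degree), key=lambda n: degree[n])
        star_edges = {e for e in remaining if center in e}
        leaves = sorted(next(iter(e - {center})) for e in star_edges)
        stars.append((center, leaves))
        remaining -= star_edges
    return stars


# Hypothetical query graph over made-up entity types.
query_edges = [
    ("product", "version"),
    ("version", "vulnerability"),
    ("version", "vendor"),
    ("vulnerability", "exploit"),
]
stars = decompose_into_stars(query_edges)
```

Each resulting star could then be answered independently and its candidates joined on the shared nodes, which is the general shape of the approach the abstract describes.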

Details

Title
Scalable top-k query on information networks with hierarchical inheritance relations
Author
Wu, Fubao 1; Gao, Lixin 1

1 University of Massachusetts Amherst, Department of Electrical and Computer Engineering, Amherst, USA (GRID:grid.266683.f) (ISNI:0000 0001 2166 5835)
Publication title
Volume
42
Issue
1
Pages
1-30
Publication year
2024
Publication date
Mar 2024
Publisher
Springer Nature B.V.
Place of publication
New York
Country of publication
Netherlands
ISSN
09268782
e-ISSN
15737578
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2023-06-03
Milestone dates
2023-04-28 (Registration); 2023-04-21 (Accepted)
First posting date
03 Jun 2023
ProQuest document ID
3255419898
Document URL
https://www.proquest.com/scholarly-journals/scalable-top-i-k-query-on-information-networks/docview/3255419898/se-2?accountid=208611
Copyright
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023.
Last updated
2025-09-29
Database
ProQuest One Academic