Content area

Abstract

In massive and rapid graph streams, a useful and important task is to summarize the structure of graph streams in order to enable efficient and effective graph query processing. Although this task has been extensively studied in the literature, we observe that the existing solutions for graph sketches can only answer queries about the current status of the graph stream. In this paper, we aim at designing persistent graph sketches to support graph queries in any given time range in the past. To this end, we first introduce a baseline method by extending an existing graph summarization method. However, our empirical study suggests that the accuracy performance of the baseline method is unsatisfactory, especially when the query time interval is large. To tackle this issue, we propose a new method PGSS-BDH, which stores the streaming edges using a set of hierarchically organized hashmaps. When a query arrives, we divide the query time interval into a set of disjoint sub-intervals and return the sum of query results on all sub-intervals as the overall query answer. Observing that PGSS-BDH bears a linear space cost to the graph stream size, we further propose an advance method PGSS-MDC by using a set of fixed-size hierarchical counters to store the weight of edges, where the query processing is similar to PGSS-BDH. We theoretically analyze the accuracy performance of PGSS-BDH and PGSS-MDC. The experiment results on real-life datasets demonstrate that PGSS-MDC can return much more accurate answers than the competitors by consuming comparable query time and much less memory.

Details

Title
Persistent graph stream summarization for real-time graph analytics
Author
Jia, Yan 1 ; Gu, Zhaoquan 1 ; Jiang, Zhihao 2 ; Gao, Cuiyun 1 ; Yang, Jianye 3 

 Harbin Institute of Technology, School of Computer Science and Technology, Shenzhen, China (GRID:grid.19373.3f) (ISNI:0000 0001 0193 3564) 
 Hunan University, College of Computer Sience and Electronic Engineering, Changsha, China (GRID:grid.67293.39) 
 Guangzhou University, Cyberspace Institute of Advanced Technology, Guangzhou, China (GRID:grid.411863.9) (ISNI:0000 0001 0067 3588) 
Pages
2647-2667
Publication year
2023
Publication date
Sep 2023
Publisher
Springer Nature B.V.
ISSN
1386145X
e-ISSN
15731413
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2875646291
Copyright
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.