A novel framework for the efficient evaluation of hybrid tree-pattern queries on large data graphs

Document Type

Article

Publication Date

7-1-2023

Abstract

Graph pattern matching is a fundamental operation for exploring and analyzing large graphs, which are widely used to represent entities and their relationships in a plethora of applications. However, existing algorithms do not perform satisfactorily as they generate a large number of intermediate results. In this paper, we introduce a novel framework to address the problem of efficiently finding homomorphic matches of hybrid tree-patterns over large graphs. These are patterns that can include two types of edges: child edges (which are mapped to edges in the data graph) and/or descendant edges (which are mapped to paths in the data graph), thus allowing for higher expressiveness and flexibility in query formulation. We introduce the concept of answer graph to compactly represent the query results and exploit computation sharing. Leveraging the answer graph, our approach is able to avoid the generation of intermediate results not participating in the query answer, and can produce the query answer in time linear on the size of the input data graph and the output. Moreover, the answer graph allows query counting without explicitly enumerating query results. We design a bottom-up algorithm (BUP) and a simulation-based algorithm (SIM) for building the answer graph, each having their own strengths in different cases. We also present two algorithms for enumerating the results using the answer graph. Our extensive experimental evaluation on real and synthetic datasets shows that our proposed techniques greatly outperform the state-of-the-art graph matching algorithms as well as a popular graph DBMS on evaluating various hybrid tree-pattern queries. Our source code, datasets and queries are publicly available at https://github.com/wuxyng/TwigMatchOnGraph.

Identifier

85165129998 (Scopus)

Publication Title

Information Systems

External Full Text Location

https://doi.org/10.1016/j.is.2023.102249

ISSN

03064379

Volume

117

Grant

61872276

Fund Ref

National Natural Science Foundation of China

This document is currently not available here.

Share

COinS