Efficient discovery of embedded patterns from large attributed trees
Document Type
Conference Proceeding
Publication Date
1-1-2018
Abstract
Discovering informative patterns hidden deep in large tree datasets is an important research area with many practical applications. Many modern applications and systems represent, export, and exchange data as trees whose nodes are associated with attributes. In this paper, we address the problem of mining frequent embedded attributed patterns from large attributed data trees. Attributed pattern mining requires combining tree mining with itemset mining, which results in a larger pattern search space than addressing each problem separately. We first design an interleaved pattern mining approach that extends the equivalence-class-based tree pattern enumeration technique with attribute set enumeration. We then propose a novel layered approach that discovers all frequent attributed patterns in stages. This approach integrates an itemset mining technique with a recent unordered embedded tree pattern mining algorithm to greatly reduce the pattern search space. Our extensive experimental results on real and synthetic large-tree datasets show that the layered approach achieves, in most cases, orders-of-magnitude performance improvements over both the interleaved mining method and the attribute-as-node embedded tree pattern mining method, and that it scales well.
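To make the problem statement concrete, the following is a minimal brute-force sketch of what it could mean for an attributed pattern to be embedded in an attributed data tree, and how its frequency might be counted. It is not the interleaved or layered algorithm from the paper. It assumes (these details are not given in the abstract) that each node carries only a set of attributes, that a pattern node may map to a data node whose attribute set contains the pattern node's set, that pattern edges map to ancestor-descendant paths, that the mapping is injective, and that support is the number of data trees containing the pattern. The names `Node`, `occurs`, and `support` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A tree node carrying a set of attributes (assumed representation)."""
    attrs: frozenset
    children: list = field(default_factory=list)


def descendants(node):
    """Yield every proper descendant of node."""
    for child in node.children:
        yield child
        yield from descendants(child)


def _embeddings(p, t, used):
    """Yield the sets of used data-node ids for each way to embed the
    pattern subtree rooted at p with p mapped to data node t."""
    if id(t) in used or not p.attrs <= t.attrs:
        return  # node already taken, or attribute containment fails
    yield from _place(p.children, t, used | {id(t)})


def _place(pattern_children, t, used):
    """Map each pattern child to a distinct proper descendant of t."""
    if not pattern_children:
        yield used
        return
    first, rest = pattern_children[0], pattern_children[1:]
    for d in descendants(t):
        for used_after in _embeddings(first, d, used):
            yield from _place(rest, t, used_after)


def occurs(pattern, tree):
    """True if the pattern has at least one embedding in the tree."""
    candidates = [tree, *descendants(tree)]
    return any(True for n in candidates
               for _ in _embeddings(pattern, n, frozenset()))


def support(pattern, trees):
    """Number of data trees that contain the pattern (assumed definition)."""
    return sum(occurs(pattern, t) for t in trees)


# Hypothetical data tree: {a,b} -> [{c} -> [{d,e}], {f}]
data_tree = Node(frozenset("ab"), [
    Node(frozenset("c"), [Node(frozenset("de"))]),
    Node(frozenset("f")),
])
# Hypothetical pattern: {a} with descendants {d} and {f}
pattern = Node(frozenset("a"), [Node(frozenset("d")), Node(frozenset("f"))])
found = occurs(pattern, data_tree)
```

In this example the pattern node `{d}` is matched two levels below the root, which is what distinguishes an embedded pattern from one that must follow parent-child edges. The exhaustive search is exponential in the worst case and is meant only to illustrate the definitions, not to suggest a practical mining method.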
Identifier
85048971810 (Scopus)
ISBN
9783319914572
Publication Title
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
External Full Text Location
https://doi.org/10.1007/978-3-319-91458-9_34
e-ISSN
1611-3349
ISSN
0302-9743
First Page
558
Last Page
576
Volume
10828 LNCS
Recommended Citation
Wu, Xiaoying and Theodoratos, Dimitri, "Efficient discovery of embedded patterns from large attributed trees" (2018). Faculty Publications. 8946.
https://digitalcommons.njit.edu/fac_pubs/8946
