Billion-scale Detection of Isomorphic Nodes

Document Type

Conference Proceeding

Publication Date

1-1-2023

Abstract

This paper presents an algorithm for detecting isomorphism among attributed, high-degree nodes. High-degree isomorphic nodes seldom arise by chance and often represent duplicated entities or data-processing errors. By definition, isomorphic nodes are topologically indistinguishable, which can be problematic in graph ML tasks. The algorithm employs a parallel, 'degree-bounded' approach that fingerprints each node's local properties through a hash and constrains the search to nodes within hash-defined buckets, thus minimising the number of comparisons. This method scales to graphs with billions of nodes and edges. Finally, we present isomorphic-node oddities identified in real-world data.
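
The bucketing idea described in the abstract can be illustrated with a minimal, single-threaded Python sketch. It is not the authors' implementation. It assumes that two nodes count as isomorphic when they carry the same attribute and have identical neighbour sets, and that the fingerprint combines the attribute, the degree, and the few smallest neighbour identifiers. The function name `find_isomorphic_groups` and the parameters `min_degree` and `sample_size` are hypothetical, introduced only for illustration.

```python
from collections import defaultdict


def find_isomorphic_groups(adjacency, attributes, min_degree=2, sample_size=3):
    """Return groups of nodes sharing an attribute and an identical neighbour set.

    Assumption (not from the paper): this is the working definition of
    "isomorphic nodes" used by this sketch.
    """
    # Step 1: bucket nodes by a cheap fingerprint of their local properties.
    buckets = defaultdict(list)
    for node, neighbours in adjacency.items():
        if len(neighbours) < min_degree:
            continue  # degree bound: low-degree nodes are skipped entirely
        sample = tuple(sorted(neighbours)[:sample_size])
        fingerprint = hash((attributes.get(node), len(neighbours), sample))
        buckets[fingerprint].append(node)

    # Step 2: confirm candidates exactly, but only inside each bucket.
    # Buckets are independent, so this loop could be run in parallel.
    groups = []
    for candidates in buckets.values():
        if len(candidates) < 2:
            continue  # a node alone in its bucket has no possible match
        exact = defaultdict(list)
        for node in candidates:
            key = (attributes.get(node), frozenset(adjacency[node]))
            exact[key].append(node)
        groups.extend(sorted(g) for g in exact.values() if len(g) > 1)
    return sorted(groups)


# Toy undirected graph: "a" and "b" share all neighbours, as do "x" and "y".
adjacency = {
    "a": {"x", "y", "z"},
    "b": {"x", "y", "z"},
    "c": {"x", "y"},
    "x": {"a", "b", "c"},
    "y": {"a", "b", "c"},
    "z": {"a", "b"},
}
attributes = {"a": "gene", "b": "gene", "c": "gene",
              "x": "disease", "y": "disease", "z": "disease"}

groups = find_isomorphic_groups(adjacency, attributes, min_degree=3)
print(groups)  # [['a', 'b'], ['x', 'y']]
```

The fingerprint only narrows the candidates; hash collisions are resolved by the exact comparison in the second step, so a collision costs extra work but cannot produce a false match under the definition assumed here.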

Identifier

85169299295 (Scopus)

ISBN

9798350311990

Publication Title

2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW 2023)

External Full Text Location

https://doi.org/10.1109/IPDPSW59300.2023.00046

First Page

230

Last Page

233

Grant

2109988

Fund Ref

National Science Foundation
