On a two-stage progressive clustering algorithm with graph-augmented density peak clustering
Document Type
Article
Publication Date
2-1-2022
Abstract
Due to the rapidly growing volume and velocity of big data, real-time streaming data analysis has become increasingly important in many applications. To discover knowledge from such data, a wide range of machine learning techniques have been proposed and used in practice. Among them, clustering, which aims at grouping objects into different classes on the basis of their similarity, is the most common form of unsupervised learning. However, most existing clustering algorithms are designed for static data, and hence are not best suited for streaming data. In this paper, we propose PC-DPC, a two-stage progressive clustering algorithm with graph-augmented density peak clustering. PC-DPC first identifies clusters of streaming data using an improved density peak clustering algorithm, and then merges newly arriving data into the existing data pool by measuring inter-cluster structural similarity, which considers the distance between a center and representative points. We illustrate the superiority of PC-DPC over several state-of-the-art clustering algorithms in terms of clustering accuracy and running time on publicly available benchmark datasets.
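For readers unfamiliar with the first stage, the following is a minimal sketch of baseline density peak clustering, not the improved graph-augmented variant or the merging stage proposed in the paper, whose details the abstract does not give. It assumes a Gaussian-kernel local density and selects centers as the points with the largest product of density and distance to the nearest denser point. The function name `density_peak_clustering` and the parameters `cutoff` and `n_clusters` are illustrative assumptions, not names taken from the article.

```python
import numpy as np


def density_peak_clustering(points, cutoff, n_clusters):
    """Baseline density peak clustering (illustrative sketch, not PC-DPC)."""
    points = np.asarray(points, dtype=float)
    n = len(points)

    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

    # Local density with a Gaussian kernel; subtract 1 to drop the self term.
    rho = np.exp(-(dist / cutoff) ** 2).sum(axis=1) - 1.0

    # Indices sorted from densest to sparsest.
    order = np.argsort(-rho, kind="stable")

    # delta[i]: distance from i to the nearest point that ranks denser than i.
    delta = np.zeros(n)
    nearest_higher = np.full(n, -1)
    delta[order[0]] = dist[order[0]].max()  # convention for the densest point
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        nearest_higher[i] = j

    # Centers are the points with the largest rho * delta.
    centers = np.argsort(-(rho * delta), kind="stable")[:n_clusters]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)

    # Remaining points inherit the label of their nearest denser neighbor,
    # visited in descending density so that neighbor is already labeled.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels, centers


# Hypothetical toy data: two small, well-separated groups of five points each.
offsets = np.array([[0, 0], [0.1, 0], [0, 0.1], [-0.1, 0], [0, -0.1]])
demo_points = np.vstack([offsets, offsets + 5.0])
demo_labels, demo_centers = density_peak_clustering(demo_points, cutoff=0.5, n_clusters=2)
```

This sketch computes the full pairwise distance matrix on every call, so it targets static data only; handling newly arriving data without reclustering is the problem the second stage described in the abstract is meant to address.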
Identifier
85120998845 (Scopus)
Publication Title
Engineering Applications of Artificial Intelligence
External Full Text Location
https://doi.org/10.1016/j.engappai.2021.104566
ISSN
0952-1976
Volume
108
Grant
SGSCXT00XGJS1800219
Fund Ref
Sichuan Province Science and Technology Support Program
Recommended Citation
Niu, Xinzheng; Zheng, Yunhong; Liu, Wuji; and Wu, Chase Q., "On a two-stage progressive clustering algorithm with graph-augmented density peak clustering" (2022). Faculty Publications. 3138.
https://digitalcommons.njit.edu/fac_pubs/3138