A self-adaptive graph-based clustering method with noise identification

Li, Lin; Chen, Xiang; Song, Chengyun

doi:10.1007/s10044-023-01160-0

Short Paper
Published: 12 April 2023

Volume 26, pages 907–916, (2023)
Pattern Analysis and Applications

Abstract

Graph-based clustering methods offer competitive performance on complex, nonlinear data patterns. Their distinguishing strength is the ability to mine the internal topological structure of a dataset. However, most graph-based clustering algorithms are sensitive to their parameter settings. In this paper, we propose a self-adaptive graph-based clustering method (SAGC) with noise identification, built on a directed natural neighbor graph, that automatically identifies the number of clusters and produces reliable clustering results without prior knowledge or parameter setting. The method uses a parameter-adaptive process to handle specific data patterns, and it can identify clusters of diverse shapes and detect noise. We validate the proposed method on synthetic and UCI real-world datasets by comparing it with k-means, DBSCAN, OPTICS, affinity propagation (AP), spectral clustering (SC), CutPC, and WC in terms of clustering Accuracy, Adjusted Rand index, Normalized Mutual Information, and Fowlkes–Mallows index. The experimental results confirm that the proposed method contributes to the progress of graph-based clustering algorithms.
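As an illustration of three of the evaluation measures named in the abstract, the sketch below computes the Adjusted Rand index, the Fowlkes–Mallows index, and Normalized Mutual Information from two label lists. It is not taken from the paper: the function names are hypothetical, NMI is normalized by the arithmetic mean of the two entropies (other normalizations exist), and a noise label such as -1 (the convention used by scikit-learn's DBSCAN) is simply treated as one more cluster. Clustering Accuracy is omitted because it additionally requires an optimal matching between predicted and true labels.

```python
from collections import Counter
from math import comb, log, sqrt


def pair_counts(truth, pred):
    """Count point pairs grouped together in both labelings, in truth, in pred, and overall."""
    cells = Counter(zip(truth, pred))
    both = sum(comb(n, 2) for n in cells.values())
    in_truth = sum(comb(n, 2) for n in Counter(truth).values())
    in_pred = sum(comb(n, 2) for n in Counter(pred).values())
    return both, in_truth, in_pred, comb(len(truth), 2)


def adjusted_rand_index(truth, pred):
    """Rand index corrected for chance agreement; 1.0 means identical partitions."""
    both, a, b, total = pair_counts(truth, pred)
    if total == 0:
        return 1.0  # fewer than two points: nothing to disagree on
    expected = a * b / total
    max_index = (a + b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions are trivial
    return (both - expected) / (max_index - expected)


def fowlkes_mallows(truth, pred):
    """Geometric mean of pairwise precision and recall."""
    both, a, b, _ = pair_counts(truth, pred)
    return both / sqrt(a * b) if a and b else 0.0


def normalized_mutual_info(truth, pred):
    """Mutual information divided by the mean entropy of the two labelings (an assumed normalization)."""
    n = len(truth)
    ct, cp = Counter(truth), Counter(pred)
    cells = Counter(zip(truth, pred))
    mi = sum(c / n * log(c * n / (ct[t] * cp[p])) for (t, p), c in cells.items())
    h_truth = -sum(c / n * log(c / n) for c in ct.values())
    h_pred = -sum(c / n * log(c / n) for c in cp.values())
    denom = (h_truth + h_pred) / 2
    return mi / denom if denom > 0 else 1.0


if __name__ == "__main__":
    truth = [0, 0, 0, 1, 1, 1]
    pred = [1, 1, 1, 0, 0, -1]  # last point flagged as noise
    print(adjusted_rand_index(truth, pred))
    print(fowlkes_mallows(truth, pred))
    print(normalized_mutual_info(truth, pred))
```

All three scores are invariant to how the clusters are numbered, which is why they can compare algorithms that assign arbitrary cluster ids.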

Data availability

The data that support the findings of this study are available from the corresponding author, upon reasonable request.

Acknowledgements

This work is supported by the Natural Science Foundation of China (No. 41804112), the Youth Project of Science and Technology Research Program of Chongqing Education Commission of China (No. KJQN202001143) and the High Quality Development Plan of Graduate Education of Chongqing University of Technology (No. gzlcx20223216).

Funding

The research leading to these results received funding from the Natural Science Foundation of China (Grant No. 41804112), the Youth Project of Science and Technology Research Program of Chongqing Education Commission of China (Grant No. KJQN202001143), and the High Quality Development Plan of Graduate Education of Chongqing University of Technology (Grant No. gzlcx20223216).

Author information

Lin Li and Xiang Chen contributed equally to this work.

Authors and Affiliations

College of Computer Science and Engineering, Chongqing University of Technology, Huaxi, Banan, Chongqing, 400054, China
Lin Li & Chengyun Song
College of Computer Science, Chongqing University, ShaZhengjie, Shapingba, Chongqing, 400044, China
Xiang Chen

Corresponding author

Correspondence to Chengyun Song.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, L., Chen, X. & Song, C. A self-adaptive graph-based clustering method with noise identification. Pattern Anal Applic 26, 907–916 (2023). https://doi.org/10.1007/s10044-023-01160-0

Received: 29 May 2022
Accepted: 21 March 2023
Published: 12 April 2023
Issue Date: August 2023
DOI: https://doi.org/10.1007/s10044-023-01160-0
