An efficient clustering algorithm based on searching popularity peaks

Motallebi, Hassan; Malakoutifar, Najmeh

doi:10.1007/s10044-024-01261-4

Theoretical Advances
Published: 21 May 2024

Pattern Analysis and Applications, Volume 27, article number 67 (2024)

Abstract

In order to address some deficiencies of the density peak clustering algorithm, namely sensitivity to density kernels and challenges with large density differences across clusters, we propose a popularity peak clustering algorithm that is based on a more robust notion of density called popularity. The popularity of a sample is computed according to the number, similarity and popularity of points that have the sample in their k-nearest neighbors. The popularity concept has some properties that help in handling challenges like identifying cluster centers in sparse regions and handling situations with large density differences across clusters. Moreover, in the density peak clustering algorithm, the strategy of assigning non-center points to the same cluster as their nearest higher-density neighbor can cause error propagation. To address this issue, we also propose a new popularity-based label assignment strategy. Our results demonstrate that the proposed algorithm can recognize clusters regardless of their densities and overlap degree and can often outperform the existing density peak clustering algorithms.
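The abstract describes popularity only in words, so the following is a minimal sketch of one possible reading, not the authors' formula. It assumes a PageRank-style iteration in which each point distributes one unit of "vote" among its k nearest neighbors, weighted by a similarity that is scaled by the voter's own mean neighbor distance. The function name `popularity`, the similarity `1 / (1 + d / scale)`, the `damping` factor, and the iteration count are all illustrative assumptions.

```python
import math

def popularity(points, k=3, n_iter=50, damping=0.85):
    """Hypothetical popularity score: a point is popular when many similar,
    popular points list it among their k nearest neighbors."""
    n = len(points)
    votes = []  # votes[i][j] = share of i's vote given to its neighbor j
    for i, p in enumerate(points):
        # k nearest neighbors of point i (ties broken by index).
        nearest = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )[:k]
        # Assumed local scale: the mean distance from i to its neighbors.
        scale = sum(d for d, _ in nearest) / k + 1e-12
        sims = {j: 1.0 / (1.0 + d / scale) for d, j in nearest}
        total = sum(sims.values())
        votes.append({j: s / total for j, s in sims.items()})

    pop = [1.0 / n] * n
    for _ in range(n_iter):
        new = [(1.0 - damping) / n] * n
        for i, row in enumerate(votes):
            for j, w in row.items():
                # j gains popularity from each voter i, in proportion to
                # their similarity and to i's own current popularity.
                new[j] += damping * w * pop[i]
        pop = new
    return pop

# Toy data: two cross-shaped clusters, the second ten times sparser.
cross = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1)]
points = cross + [(100 + 10 * x, 100 + 10 * y) for x, y in cross]
pop = popularity(points)
```

Because the similarity in this sketch is scaled locally, the centers of the dense and the sparse cross receive essentially the same score in this toy layout, which illustrates (under these assumptions) why a reverse-neighbor notion of density can be less sensitive to density differences across clusters than a fixed-radius kernel.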

Data availability

The data that support the findings of this study are openly available in the Kaggle and UCI Machine Learning repositories at https://www.kaggle.com/datasets and https://archive.ics.uci.edu/ml/index.php, respectively, and also at https://cs.joensuu.fi/sipu/datasets/.

Author information

Authors and Affiliations

Faculty of Electrical and Computer Engineering, Graduate University of Advanced Technology (GUAT), Kerman, Iran
Hassan Motallebi
Department of Computer, Faculty of Fatimah, Kerman Branch, Technical and Vocational University (TVU), Kerman, Iran
Najmeh Malakoutifar

Corresponding author

Correspondence to Hassan Motallebi.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Motallebi, H., Malakoutifar, N. An efficient clustering algorithm based on searching popularity peaks. Pattern Anal Applic 27, 67 (2024). https://doi.org/10.1007/s10044-024-01261-4

Received: 01 November 2022
Accepted: 18 February 2024
Published: 21 May 2024
DOI: https://doi.org/10.1007/s10044-024-01261-4
