Abstract
The density peaks clustering (DPC) algorithm offers an efficient way to find cluster centers quickly using a decision graph. In recent years DPC has been widely studied and applied because it has a single parameter, requires no iteration, and is robust. However, it also has shortcomings: it can fail to identify cluster centers effectively, and a misallocated non-center point can trigger a chain reaction of further allocation errors. To address these two shortcomings, an improved density peaks clustering algorithm based on variance (DPCV) is proposed. First, the algorithm uses the variance between points to improve the similarity measure and reduce density differences in unevenly distributed data sets. Then, based on the similar densities of a cluster center and its surrounding points, low-density points are used as the dividing boundary in the initial allocation process. To reduce the time spent calculating the variance, this paper replaces the variance with the Manhattan distance between points and proposes density peaks clustering based on Manhattan distance (MDDPC). Theoretical analysis and experiments on artificial and UCI data sets show that, compared with DPC and its improved variants, DPCV and MDDPC further improve clustering accuracy while keeping the running time under control.
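For orientation, the baseline DPC procedure that the abstract builds on can be sketched as follows. This is a minimal illustration of the standard cutoff-kernel formulation, not the DPCV or MDDPC method proposed here, whose variance-based and Manhattan-based similarity definitions are not given in the abstract. The function name `dpc`, the parameters `dc` and `n_clusters`, and the choice to pick centers automatically by the largest product of density and distance (instead of reading them off a decision graph) are assumptions made for the example.

```python
import math


def dpc(points, dc, n_clusters):
    """Baseline density peaks clustering sketch (cutoff kernel).

    points: list of coordinate tuples; dc: cutoff distance (the single
    DPC parameter); n_clusters: number of centers to pick (an assumption
    standing in for manual selection on the decision graph).
    """
    n = len(points)
    # Pairwise Euclidean distances.
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density rho: number of other points closer than dc.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < dc)
           for i in range(n)]
    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n   # distance to the nearest point of higher density
    parent = [-1] * n   # that nearest higher-density point
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point gets its largest distance to any point.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j
    # Decision value gamma = rho * delta; the largest values are centers.
    gamma = [rho[i] * delta[i] for i in range(n)]
    centers = sorted(order, key=lambda i: -gamma[i])[:n_clusters]
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    # Each remaining point inherits the label of its parent. Parents come
    # earlier in `order`, so they are already labeled when visited.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers
```

The final loop is the one-pass allocation step: because every point simply copies the label of its nearest higher-density neighbor, one wrong assignment is inherited by every point that depends on it, which is the chain reaction the abstract refers to.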
Acknowledgements
This work is supported by the National Natural Science Foundation of China (Nos. 61976216 and 62276265).
About this article
Cite this article
Ding, S., Du, W., Li, C. et al. Density peaks clustering algorithm based on improved similarity and allocation strategy. Int. J. Mach. Learn. & Cyber. 14, 1527–1542 (2023). https://doi.org/10.1007/s13042-022-01711-7
DOI: https://doi.org/10.1007/s13042-022-01711-7