An improved density peaks clustering algorithm using similarity assignment strategy with K-nearest neighbors

Hu, Wei; Feng, Ji; Yang, Degang

doi:10.1007/s10586-024-04592-3

Published: 16 June 2024

Volume 27, pages 12689–12706, (2024)

Cluster Computing

Abstract

Density peaks clustering (DPC) performs poorly on some datasets with particular shapes, such as manifold datasets. The main reason is that the DPC algorithm does not take into account variations in sample density between clusters or uneven densities, which can lead to the wrong cluster centers being selected. In addition, its single assignment method leads to a domino effect of assignment errors. To address these problems, this paper proposes an improved density peaks clustering method that uses a similarity assignment strategy with K-nearest neighbors (IDPC-SKNN). First, a new method for defining local density is proposed. The local density takes into account the proportion of the average density within the region, which makes it possible to locate low-density clusters precisely. Then, a new similarity allocation method is proposed that uses the K-nearest-neighbor information of the samples. This allocation strategy addresses cascading assignment errors and improves the robustness of the algorithm. Finally, according to experiments conducted on synthetic, real-world, and Olivetti Faces datasets, our algorithm outperforms all of the compared clustering algorithms on four evaluation indicators.
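For orientation, the following is a minimal sketch of the baseline DPC procedure that the abstract criticizes, not the IDPC-SKNN method itself, whose formulas are not given on this page. The function name dpc_baseline, the parameter k, and the KNN-based density formula are assumptions chosen for illustration. The final loop is the single assignment step: each point inherits the label of its nearest higher-density neighbor, so one wrong label is passed on to every point that depends on it, which is the domino effect mentioned above.

```python
import numpy as np

def dpc_baseline(X, n_clusters, k=5):
    """Illustrative baseline density peaks clustering (not IDPC-SKNN)."""
    n = X.shape[0]
    # Pairwise Euclidean distances between all samples.
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    # Assumed local density: exp of the negative mean squared distance
    # to the k nearest neighbors (column 0 is the point itself).
    knn_dist = np.sort(dist, axis=1)[:, 1:k + 1]
    rho = np.exp(-(knn_dist ** 2).mean(axis=1))
    # Indices sorted from highest to lowest density.
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    parent = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point gets the largest possible delta.
            delta[i] = dist[i].max()
            continue
        higher = order[:rank]
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]   # distance to nearest higher-density point
        parent[i] = j
    # Cluster centers: the n_clusters largest values of rho * delta.
    gamma = rho * delta
    centers = order[np.argsort(-gamma[order], kind="stable")[:n_clusters]]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    # Single assignment step: copy the label of the nearest
    # higher-density neighbor, visiting points in density order.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, rho, delta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 0.3, (30, 2)),
                   rng.normal(10.0, 0.3, (30, 2))])
    labels, rho, delta = dpc_baseline(X, n_clusters=2)
    print(labels)
```

On two well-separated blobs this sketch recovers both clusters. The failure cases described in the abstract arise when clusters differ strongly in density, because a global ranking by rho * delta then favors points in the dense cluster.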

Availability of data and materials

Readers can access the experimental data for this paper at the following GitHub link: https://github.com/milaan9/Clustering-Datasets.

Funding

This work is supported by the Science and Technology Project of Chongqing Municipal Education Commission (KJQN201800539) and the Science and Technology Research Program of Chongqing Municipal Education Commission (No. KJZD-M202300502).

Author information

Authors and Affiliations

College of Computer and Information Science, Chongqing Normal University, No. 37, Middle University City Road, Shapingba District, Chongqing, 401331, China
Wei Hu, Ji Feng & Degang Yang

Corresponding author

Correspondence to Ji Feng.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Consent for publication

All authors have agreed to publish in this journal.

Code availability

Scholars who need the code for further research may contact the corresponding author.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hu, W., Feng, J. & Yang, D. An improved density peaks clustering algorithm using similarity assignment strategy with K-nearest neighbors. Cluster Comput 27, 12689–12706 (2024). https://doi.org/10.1007/s10586-024-04592-3

Received: 11 March 2024
Revised: 20 May 2024
Accepted: 23 May 2024
Published: 16 June 2024
Issue Date: December 2024
DOI: https://doi.org/10.1007/s10586-024-04592-3
