A novel clustering algorithm by adaptively merging sub-clusters based on the Normal-neighbor and Merging force

  • Theoretical advances
Pattern Analysis and Applications

Abstract

Clustering by fast search and find of density peaks (DPC) is a popular clustering method based on density and distance. In DPC, each non-center point inherits the cluster label of its nearest neighbor with higher density. This rule may misclassify some non-center points and interfere with the choice of the correct cluster centers in the decision graph. To avoid these defects, we propose NM-DPC, a novel clustering algorithm based on the Normal-neighbor and Merging force that generates clusters automatically, without using the decision graph. We conduct a series of experiments on various challenging synthetic datasets. The experimental results demonstrate that NM-DPC can better identify clusters of complex shapes and automatically recognize the number of clusters.
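
To make the label-assignment rule described above concrete, the following is a minimal Python sketch of baseline DPC, not of NM-DPC: the abstract does not define the Normal-neighbor or the Merging force, so neither is attempted here. The function name `dpc_labels`, the cutoff-kernel density with parameter `d_c`, and the choice of centers as the `n_centers` points with the largest product of density and distance (standing in for a manual reading of the decision graph) are all assumptions made for illustration.

```python
import numpy as np

def dpc_labels(points, d_c, n_centers):
    """Baseline DPC sketch (illustrative assumptions, not the NM-DPC method)."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    # Pairwise Euclidean distances between all points.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # Local density rho_i: number of other points closer than the cutoff d_c.
    rho = (dist < d_c).sum(axis=1) - 1
    # Visit points from highest to lowest density; ties are broken by index.
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    parent = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbor; use its largest distance.
            delta[i] = dist[i].max()
            continue
        higher = order[:rank]  # points ranked as denser than i
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]  # distance to the nearest denser point
        parent[i] = j          # i will inherit the label of j
    # Assumed center selection: largest rho * delta, in place of the decision graph.
    gamma = rho * delta
    centers = np.argsort(-gamma, kind="stable")[:n_centers]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_centers)
    # Each non-center point takes the label of its nearest denser point.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, rho, delta

if __name__ == "__main__":
    blob = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5]]
    data = blob + [[x + 10, y + 10] for x, y in blob]
    labels, rho, delta = dpc_labels(data, d_c=1.0, n_centers=2)
    print(labels)
```

Because every non-center label is copied from a single denser neighbor in the final loop, one unsuitable neighbor affects every point that is assigned through it, which is the weakness the abstract attributes to DPC.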

Acknowledgements

This work was supported by the National Science Foundation of P.R. China (Grant: 61873239) and the Zhejiang Science Foundation (Grant: 2020C03074).

Author information

Corresponding author

Correspondence to Sheng Li.

Cite this article

Junyi, G., Li, S., Xiongxiong, H. et al. A novel clustering algorithm by adaptively merging sub-clusters based on the Normal-neighbor and Merging force. Pattern Anal Applic 24, 1231–1248 (2021). https://doi.org/10.1007/s10044-021-00981-1

  • DOI: https://doi.org/10.1007/s10044-021-00981-1
