An improvement of spectral clustering algorithm based on fast diffusion search for natural neighbor and affinity propagation

The Journal of Supercomputing

A Correction to this article was published on 24 August 2022

This article has been updated

Abstract

Spectral clustering has become increasingly popular in recent years because it recasts data clustering as an optimal graph partitioning problem. However, its performance depends on the quality of the similarity matrix. In addition, the traditional spectral clustering algorithm is unstable because it uses the K-means algorithm in the final clustering stage. Therefore, we propose a spectral clustering algorithm based on fast diffusion search for natural neighbor and affinity propagation (FDAP-SC). The algorithm obtains neighbor information more efficiently by changing the way the number of neighbors is determined, and it uses the shared nearest neighbors and the shared reverse neighbors between two points to construct the similarity matrix. Moreover, the algorithm regards all data points as nodes in a network and then calculates the clustering center of each sample through message passing between nodes. In this paper, we first verify experimentally on real datasets that our proposed method for determining the number of neighbors outperforms the traditional natural nearest neighbor algorithm. We then demonstrate on synthetic datasets that FDAP-SC handles datasets with complex shapes well. Finally, we compare FDAP-SC with several existing classical and novel algorithms on real datasets and Olivetti face datasets, demonstrating the superiority and stability of FDAP-SC. Among the seven real datasets, FDAP-SC has the best performance on five, and on the Olivetti face datasets it achieves more than 87.5% accuracy.
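The similarity construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact formula, so the sketch assumes the similarity of two points is simply the number of shared nearest neighbors plus the number of shared reverse neighbors. It also assumes a fixed, user-supplied `k`, whereas FDAP-SC determines the number of neighbors adaptively. The function names `knn_indicator` and `shared_neighbor_similarity` are hypothetical.

```python
import numpy as np


def knn_indicator(X, k):
    """Return a boolean matrix A with A[i, j] True when j is among i's k nearest neighbors."""
    # Pairwise Euclidean distances between all points.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a point is not its own neighbor
    idx = np.argsort(d, axis=1)[:, :k]
    A = np.zeros(d.shape, dtype=bool)
    A[np.arange(len(X))[:, None], idx] = True
    return A


def shared_neighbor_similarity(X, k):
    """Similarity = shared nearest neighbors + shared reverse neighbors (assumed formula)."""
    A = knn_indicator(X, k).astype(int)
    # (A @ A.T)[i, j] counts points p that are in both kNN(i) and kNN(j).
    shared_nn = A @ A.T
    # (A.T @ A)[i, j] counts points p whose kNN list contains both i and j,
    # i.e. the shared reverse neighbors of i and j.
    shared_rnn = A.T @ A
    S = (shared_nn + shared_rnn).astype(float)
    np.fill_diagonal(S, 0.0)  # ignore self-similarity
    return S
```

A matrix built this way is symmetric and is zero between points that share no neighbors, so it could in principle serve as the graph weights from which a spectral embedding is computed, with a message-passing method such as affinity propagation replacing K-means in the final stage, as the abstract outlines.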





Acknowledgements

This work was partially supported by the National Natural Science Foundation of China (Grant No. 61972179) and the Guangdong Basic and Applied Basic Research Foundation (Grant Nos. 2020A1515011476 and 2021B1515120048).

Author information

Corresponding author

Correspondence to Ziyang Li.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: the headers in Table 3 were interchanged.

Appendix: Notation

See Table 9.

Table 9 Symbol Table

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yang, Q., Li, Z., Han, G. et al. An improvement of spectral clustering algorithm based on fast diffusion search for natural neighbor and affinity propagation. J Supercomput 78, 14597–14625 (2022). https://doi.org/10.1007/s11227-022-04456-w

  • DOI: https://doi.org/10.1007/s11227-022-04456-w
