An improvement of spectral clustering algorithm based on fast diffusion search for natural neighbor and affinity propagation

Yang, Qifen; Li, Ziyang; Han, Gang; Gao, Wanyi; Zhu, Shuhua; Wu, Xiaotian; Deng, Yuhui

doi:10.1007/s11227-022-04456-w

Published: 06 April 2022

Volume 78, pages 14597–14625, (2022)
The Journal of Supercomputing

Qifen Yang⁵,
Ziyang Li ORCID: orcid.org/0000-0002-1647-0301²,
Gang Han³,
Wanyi Gao⁴,
Shuhua Zhu¹,
Xiaotian Wu⁵ &
Yuhui Deng⁵

A Correction to this article was published on 24 August 2022

This article has been updated

Abstract

Spectral clustering has become increasingly popular in recent years because it recasts data clustering as the problem of optimally partitioning a graph. However, the performance of spectral clustering depends on the quality of the similarity matrix. In addition, the traditional spectral clustering algorithm is unstable because it uses the K-means algorithm in the final clustering stage. We therefore propose a spectral clustering algorithm based on fast diffusion search for natural neighbor and affinity propagation (FDAP-SC). The algorithm obtains neighbor information more efficiently by changing the way the number of neighbors is determined, and it constructs the similarity matrix from the shared nearest neighbors and the shared reverse neighbors of each pair of points. Moreover, the algorithm regards all data points as nodes in a network and calculates the clustering center of each sample through message passing between nodes. In this paper, we first verify experimentally on real datasets that our proposed method for determining the number of neighbors outperforms the traditional natural nearest neighbor algorithm. We then demonstrate on synthetic datasets that FDAP-SC handles datasets with complex shapes well. Finally, we compare FDAP-SC with several existing classical and novel algorithms on real datasets and the Olivetti face datasets, demonstrating the superiority and stability of FDAP-SC. Among the seven real datasets, FDAP-SC performs best on five, and on the Olivetti face datasets it achieves more than 87.5% accuracy.
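As a rough illustration of the neighbor-based similarity described in the abstract, the following Python sketch runs the traditional natural-neighbor search (the baseline the abstract compares against, not the paper's fast diffusion search) and then scores each pair of points by counting their shared nearest neighbors and shared reverse neighbors. The function names `natural_neighbors` and `shared_neighbor_similarity`, the stopping rule, and the simple additive score are assumptions made for this example; the abstract does not give the exact formulas used in FDAP-SC.

```python
import math


def natural_neighbors(points):
    """Traditional natural-neighbor search (illustrative, not FDAP-SC itself).

    The neighborhood size r grows by one per round until every point has
    at least one reverse neighbor, or the number of points without any
    reverse neighbor stops changing.
    """
    n = len(points)
    # For each point, all other indices sorted by Euclidean distance.
    order = [
        sorted((j for j in range(n) if j != i),
               key=lambda j: math.dist(points[i], points[j]))
        for i in range(n)
    ]
    knn = [set() for _ in range(n)]  # nearest neighbors found so far
    rnn = [set() for _ in range(n)]  # reverse neighbors found so far
    prev_orphans = n
    r = 0
    while r < n - 1:
        for i in range(n):
            j = order[i][r]          # the (r + 1)-th nearest neighbor of i
            knn[i].add(j)
            rnn[j].add(i)
        r += 1
        orphans = sum(1 for s in rnn if not s)
        if orphans == 0 or orphans == prev_orphans:
            break
        prev_orphans = orphans
    return r, knn, rnn


def shared_neighbor_similarity(knn, rnn):
    """Assumed similarity: shared nearest plus shared reverse neighbors."""
    n = len(knn)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = len(knn[i] & knn[j]) + len(rnn[i] & rnn[j])
            sim[i][j] = sim[j][i] = float(shared)
    return sim


if __name__ == "__main__":
    # Two well-separated toy squares; cross-cluster similarity should be zero.
    pts = [(0, 0), (0, 1), (1, 0), (1, 1),
           (10, 10), (10, 11), (11, 10), (11, 11)]
    r, knn, rnn = natural_neighbors(pts)
    sim = shared_neighbor_similarity(knn, rnn)
    print("neighborhood size:", r)
    for row in sim:
        print(row)
```

In a full pipeline, a matrix of this kind would feed the spectral embedding step, and the embedded points would then be grouped by affinity propagation rather than K-means; both steps are omitted here because the abstract does not specify them in enough detail to reproduce.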

Change history

24 August 2022
A Correction to this paper has been published: https://doi.org/10.1007/s11227-022-04743-6

Acknowledgements

This work was partially supported by the National Natural Science Foundation of China (Grant No. 61972179) and the Guangdong Basic and Applied Basic Research Foundation (Grant Nos. 2020A1515011476 and 2021B1515120048).

Author information

Authors and Affiliations

Network & Educational Technology Center, Jinan University, Guangzhou, 510632, China
Shuhua Zhu
College of Art, Northeast Agricultural University, Harbin, Heilongjiang, 150030, China
Ziyang Li
Graduate School, Jinan University, Guangzhou, 510632, China
Gang Han
School of Economics, Jinan University, Guangzhou, 510632, China
Wanyi Gao
Department of Computer Science, Jinan University, Guangzhou, 510632, China
Qifen Yang, Xiaotian Wu & Yuhui Deng

Corresponding author

Correspondence to Ziyang Li.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: the headers in Table 3 were interchanged.

Appendix: Notation

See Table 9.

Table 9 Symbol Table

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yang, Q., Li, Z., Han, G. et al. An improvement of spectral clustering algorithm based on fast diffusion search for natural neighbor and affinity propagation. J Supercomput 78, 14597–14625 (2022). https://doi.org/10.1007/s11227-022-04456-w

Accepted: 07 March 2022
Published: 06 April 2022
Issue Date: August 2022
DOI: https://doi.org/10.1007/s11227-022-04456-w
