Manifold clustering optimized by adaptive aggregation strategy

  • Regular Paper
  • Knowledge and Information Systems

Abstract

Unlike general spherical datasets, manifold datasets have a more complex spatial structure, which makes it difficult to separate sample points lying on different manifolds by Euclidean distance alone. The density peak clustering algorithm (DPC; two parameters: the cut-off ratio \(\mathrm{dc}\) and the number of class centers \(C\)) can find density peaks quickly and assign sample points to them, but because it measures sample similarity only by Euclidean distance, it cannot effectively identify clusters with complex manifold structures. To address these problems, this paper proposes Manifold Clustering optimized by Adaptive Aggregation Strategy (MC-AAS; two parameters: the number of nearest neighbors \(k\) and the threshold ratio of core points \(p\)). First, it introduces a novel manifold similarity measure based on shared nearest neighbors and redefines the local density of each sample point by summing manifold similarities. Second, core points are determined from the statistical characteristics of the local density, and local sub-clusters of the manifold-structured dataset are obtained by connecting core points to their nearest neighbors. Third, these initial sub-clusters are merged on the basis of a statistical test of boundary density and the silhouette coefficient of adjacent sub-clusters, which yields the final clusters. Finally, we conduct extensive experiments on synthetic and real-world datasets using three evaluation metrics: Adjusted Mutual Information, Adjusted Rand Index, and Fowlkes-Mallows Index. The experimental results indicate that, compared with current methods, MC-AAS achieves better clustering results on complex manifold datasets and is more robust.
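The first two steps of the abstract (a similarity built from shared nearest neighbors, a local density obtained by summing that similarity, and core points selected from the density statistics) can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: similarity is taken as the fraction of shared k-nearest neighbors, density as the sum of similarities to a point's own k neighbors, and a core point as one whose density is at least \(p\) times the mean density. The function names are hypothetical and are not the authors' code.

```python
import numpy as np

def knn_indices(X, k):
    """Indices of the k nearest neighbors of each row of X (self excluded)."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]

def snn_similarity(neighbors):
    """Assumed similarity: fraction of k nearest neighbors two points share."""
    n, k = neighbors.shape
    member = np.zeros((n, n), dtype=int)
    member[np.arange(n)[:, None], neighbors] = 1  # member[i, j] = 1 if j is in kNN(i)
    shared = member @ member.T                    # size of kNN(i) intersected with kNN(j)
    return shared / k

def local_density(sim, neighbors):
    """Assumed density: sum of a point's similarity to its own k neighbors."""
    rows = np.arange(sim.shape[0])[:, None]
    return sim[rows, neighbors].sum(axis=1)

def core_points(density, p):
    """Assumed rule: a point is a core point if density >= p * mean density."""
    return density >= p * density.mean()

# Toy data: two well-separated Gaussian blobs of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(5.0, 0.3, (20, 2))])

k, p = 5, 1.0                      # the two MC-AAS parameters named in the abstract
neighbors = knn_indices(X, k)
sim = snn_similarity(neighbors)
density = local_density(sim, neighbors)
is_core = core_points(density, p)
```

Because the similarity depends only on neighborhood overlap, two points in different blobs get similarity zero here regardless of how the blobs are shaped, which is the property that motivates replacing raw Euclidean distance on manifold data.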

Notes

  1. https://docs.scipy.org/doc/scipy-1.0.0/reference/generated/scipy.cluster.hierarchy.linkage.html.

  2. https://scikit-learn.org/dev/modules/generated/sklearn.metrics.silhouette_samples.html.

  3. https://scikit-learn.org/dev/modules/generated/sklearn.metrics.silhouette_score.html.

  4. https://scikit-learn.org/stable/modules/generated/sklearn.metrics.adjusted_mutual_info_score.html.

  5. https://scikit-learn.org/stable/modules/generated/sklearn.metrics.adjusted_rand_score.html.

  6. https://scikit-learn.org/stable/modules/generated/sklearn.metrics.fowlkes_mallows_score.html.

  7. https://scikit-learn.org/dev/modules/generated/sklearn.cluster.AffinityPropagation.html.
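The scikit-learn functions linked in notes 4 to 6 compute the three evaluation metrics named in the abstract. The short sketch below shows how they might be called on a pair of label vectors; the labels are made-up toy values, not results from the paper.

```python
from sklearn.metrics import (
    adjusted_mutual_info_score,
    adjusted_rand_score,
    fowlkes_mallows_score,
)

# Toy ground-truth labels and two hypothetical clustering outputs.
labels_true = [0, 0, 0, 1, 1, 1]
labels_perfect = [1, 1, 1, 0, 0, 0]    # same partition, cluster ids swapped
labels_imperfect = [0, 0, 1, 1, 1, 1]  # one point assigned to the wrong cluster

def evaluate(y_true, y_pred):
    """Return the three metrics as a dict keyed by their usual abbreviations."""
    return {
        "AMI": adjusted_mutual_info_score(y_true, y_pred),
        "ARI": adjusted_rand_score(y_true, y_pred),
        "FMI": fowlkes_mallows_score(y_true, y_pred),
    }

scores_perfect = evaluate(labels_true, labels_perfect)
scores_imperfect = evaluate(labels_true, labels_imperfect)

for name in ("AMI", "ARI", "FMI"):
    print(name, scores_perfect[name], scores_imperfect[name])
```

All three metrics compare partitions rather than label values, so the relabeled but identical partition scores the same as an exact match.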

Funding

This work was supported by the National Key Research and Development Program of China (No. 2021YFC3300602).

Author information

Contributions

YZ: conceptualization, methodology, code, writing (original draft). XW: supervision, writing (review and editing), funding acquisition. CL: supervision, writing (review and editing). All authors read and approved the final manuscript.

Corresponding author

Correspondence to Xiao Wei.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethics approval

Not Applicable.

Consent to participate

Not Applicable.

Consent for publication

Not Applicable.

Availability of data and material

The experimental datasets in this paper fall into two categories: six artificially constructed two-dimensional manifold datasets, and six real-world datasets from the UCI Machine Learning Repository. These datasets are available upon request from the corresponding author.

Code availability

The Python code used to obtain the simulation results is available upon request from the corresponding author.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, Y., Wei, X. & Li, C. Manifold clustering optimized by adaptive aggregation strategy. Knowl Inf Syst 65, 379–408 (2023). https://doi.org/10.1007/s10115-022-01769-3

  • DOI: https://doi.org/10.1007/s10115-022-01769-3
