An effective multi-level synchronization clustering method based on a linear weighted Vicsek model

Applied Intelligence

Abstract

To overcome the limitation that general clustering methods cannot process data sets too large to fit in main memory, this paper presents an effective multi-level synchronization clustering (MLSynC) method that uses a “divide and collect” framework and a linear weighted Vicsek model. We also introduce two concrete implementations of the MLSynC method: a two-level framework algorithm and a recursive algorithm. The MLSynC method follows a different process from the SynC, ESynC and SSynC algorithms. Theoretical analysis shows that the time complexity of the MLSynC method is lower than that of SSynC. Simulations and experiments on several kinds of data sets show that the MLSynC method not only achieves a better local synchronization effect than the SynC algorithm but also needs fewer iterations and less time. Moreover, the MLSynC method needs less time than ESynC and SSynC while achieving almost the same local synchronization effect, provided that the data set is partitioned properly. Further comparison experiments with some classical clustering algorithms demonstrate the clustering effect of the MLSynC method.
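As a rough illustration of the “divide and collect” idea, the following Python sketch shows one way a two-level scheme could be arranged. It is not the authors' algorithm: the abstract does not state the update rule, so the sketch assumes that each point moves to the weighted mean of all points within a radius `delta`, and that a representative's weight is the number of original points it stands for. The function names, the contiguous-slice partition and the merge tolerance `tol` are all hypothetical choices made for this example.

```python
import math

def weighted_vicsek_step(points, weights, delta):
    """One synchronous update (assumed rule): every point moves to the
    weighted mean of the points lying within distance delta of it."""
    updated = []
    for p in points:
        total_w = 0.0
        acc = [0.0] * len(p)
        for q, w in zip(points, weights):
            if math.dist(p, q) <= delta:  # p always counts as its own neighbor
                total_w += w
                for k, qk in enumerate(q):
                    acc[k] += w * qk
        updated.append(tuple(a / total_w for a in acc))
    return updated

def synchronize(points, weights, delta, max_iter=50, tol=1e-9):
    """Repeat the update until no point moves more than tol (or max_iter is hit)."""
    pts = [tuple(float(x) for x in p) for p in points]
    for _ in range(max_iter):
        nxt = weighted_vicsek_step(pts, weights, delta)
        shift = max(math.dist(a, b) for a, b in zip(pts, nxt))
        pts = nxt
        if shift < tol:
            break
    return pts

def collect(points, weights, tol=1e-6):
    """Merge points that ended up at (nearly) the same location, summing weights."""
    reps = []  # each entry is [location, accumulated weight]
    for p, w in zip(points, weights):
        for rep in reps:
            if math.dist(rep[0], p) <= tol:
                rep[1] += w
                break
        else:
            reps.append([p, w])
    return [r[0] for r in reps], [r[1] for r in reps]

def two_level_sync(data, n_parts, delta):
    """Divide: synchronize each slice on its own with unit weights.
    Collect: synchronize the weighted representatives of all slices."""
    size = math.ceil(len(data) / n_parts)
    rep_pts, rep_wts = [], []
    for start in range(0, len(data), size):
        chunk = data[start:start + size]
        unit = [1.0] * len(chunk)
        pts, wts = collect(synchronize(chunk, unit, delta), unit)
        rep_pts += pts
        rep_wts += wts
    return collect(synchronize(rep_pts, rep_wts, delta), rep_wts)

# Two small groups, interleaved so that each slice sees part of both.
data = [(0, 0), (5, 5), (0.1, 0), (5.1, 5), (0, 0.1), (5, 5.1)]
centers, weights = two_level_sync(data, n_parts=2, delta=1.0)
print(centers, weights)
```

In this toy run each slice only sees part of each group, and the weights carried into the second level are what let the representatives settle near the means of the full groups. Only the representatives of each slice, not the whole data set, need to be held together at the second level, which is the memory-saving idea the abstract alludes to.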

Acknowledgments

This work was supported by projects from Natural Science Research in Colleges and Universities of Anhui Province of China (grant numbers: KJ2019ZD15, KJ2019A0158), the University Synergy Innovation Program of Anhui Province (grant number: GXXT-2019-002), Anhui Polytechnic University (grant number: 2018YQQ031), Chongqing Cutting-edge and Applied Foundation Research Program (grant number: cstc2016jcyjA0521), Chongqing Three Gorges University (grant number: 16PY08) and the National Natural Science Foundation of China (grant number: 61976005). The authors thank the editors and the anonymous reviewers for their useful suggestions, which helped us improve this paper.

Author information

Corresponding author

Correspondence to Xinquan Chen.

Ethics declarations

Conflict of interest

None.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

ESM 1

(DOCX 1.64 mb)

About this article

Cite this article

Chen, X., Qiu, Y. An effective multi-level synchronization clustering method based on a linear weighted Vicsek model. Appl Intell 50, 4063–4080 (2020). https://doi.org/10.1007/s10489-020-01767-4

  • DOI: https://doi.org/10.1007/s10489-020-01767-4
