Initialization-similarity clustering algorithm

Liu, Tong; Zhu, Jingting; Zhou, Jukai; Zhu, YongXin; Zhu, Xiaofeng

doi:10.1007/s11042-019-7663-8

Published: 07 May 2019

Volume 78, pages 33279–33296, (2019)

Multimedia Tools and Applications

Tong Liu^1,2 (ORCID: orcid.org/0000-0003-3047-1148),
Jingting Zhu^2,
Jukai Zhou^2,
YongXin Zhu^3 &
Xiaofeng Zhu^1,2

Abstract

The classic k-means clustering algorithm selects its initial centroids at random, which can produce unstable clustering results and makes those results hard to reproduce. Spectral clustering follows a two-step strategy: it first generates a similarity matrix and then performs eigenvalue decomposition on the Laplacian matrix of that similarity matrix to obtain the spectral representation. However, the goal of the first step (building the similarity matrix) does not guarantee the best clustering result. To address these issues, this paper proposes an Initialization-Similarity (IS) algorithm that learns the similarity matrix and the new representation in a unified way and fixes the initialization using sum-of-norms regularization, making the clustering more robust. Experimental results on ten real-world benchmark datasets demonstrate that the IS clustering algorithm outperforms the comparison clustering algorithms on three evaluation metrics: accuracy (ACC), normalized mutual information (NMI), and Purity.
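As an illustration of two of the evaluation metrics named in the abstract, the following minimal Python sketch computes Purity and ACC from ground-truth and predicted labels. It assumes the commonly used definitions (Purity as the fraction of samples in the majority class of their cluster; ACC as the best accuracy over one-to-one mappings of clusters to classes); the abstract does not spell out the exact formulas used in the paper, and the function names `purity` and `clustering_accuracy` are hypothetical.

```python
from collections import Counter
from itertools import permutations


def purity(labels_true, labels_pred):
    """Fraction of samples that fall in the majority true class of their cluster."""
    clusters = {}
    for t, p in zip(labels_true, labels_pred):
        clusters.setdefault(p, Counter())[t] += 1
    majority_total = sum(max(counts.values()) for counts in clusters.values())
    return majority_total / len(labels_true)


def clustering_accuracy(labels_true, labels_pred):
    """Best accuracy over one-to-one mappings from cluster ids to class ids.

    Assumption: labels are integers. The search is brute force over
    permutations, so it is only practical for a handful of clusters.
    """
    classes = sorted(set(labels_true))
    clusters = sorted(set(labels_pred))
    # Pad with dummy targets so surplus clusters can stay unmatched.
    targets = classes + [None] * max(0, len(clusters) - len(classes))
    best_hits = 0
    for perm in permutations(targets, len(clusters)):
        mapping = dict(zip(clusters, perm))
        hits = sum(1 for t, p in zip(labels_true, labels_pred) if mapping[p] == t)
        best_hits = max(best_hits, hits)
    return best_hits / len(labels_true)


y_true = [0, 0, 0, 1, 1, 1]
y_two = [1, 1, 0, 0, 0, 0]    # two clusters, one sample misplaced
y_three = [0, 0, 1, 2, 2, 2]  # class 0 split across two clusters

print(purity(y_true, y_two), clustering_accuracy(y_true, y_two))
print(purity(y_true, y_three), clustering_accuracy(y_true, y_three))
```

In the second example every cluster is pure, so Purity is 1.0, while ACC is lower because the one-to-one mapping leaves the extra cluster unmatched. For larger numbers of clusters the permutation search would normally be replaced by an assignment solver.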

Funding

This work was partially supported by the Research Fund of Guangxi Key Lab of Multi-source Information Mining & Security (MIMS18-M-01); the Natural Science Foundation of China (Grant Nos. 61876046 and 61573270); the Guangxi High Institutions Program of Introducing 100 High-Level Overseas Talents; the Strategic Research Excellence Fund at Massey University; and the Marsden Fund of New Zealand (Grant No. MAU1721).

Author information

Authors and Affiliations

GuangXi Key Lab of Multi-Source Information Mining and Security, GuangXi Normal University, Guilin, 541004, China
Tong Liu & Xiaofeng Zhu
School of Natural & Computational Sciences, Massey University, Auckland, New Zealand
Tong Liu, Jingting Zhu, Jukai Zhou & Xiaofeng Zhu
Hebei GEO University, Shijiazhuang, 050000, China
YongXin Zhu

Corresponding author

Correspondence to Xiaofeng Zhu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, T., Zhu, J., Zhou, J. et al. Initialization-similarity clustering algorithm. Multimed Tools Appl 78, 33279–33296 (2019). https://doi.org/10.1007/s11042-019-7663-8

Received: 06 March 2019
Revised: 28 March 2019
Accepted: 16 April 2019
Published: 07 May 2019
Issue Date: December 2019
DOI: https://doi.org/10.1007/s11042-019-7663-8
