Abstract
This paper proposes a new spectral clustering method based on local Principal Component Analysis (PCA) and connected graph decomposition. Specifically, our method randomly selects centroids from the data set to take the global structure of the data points into consideration, and then uses local PCA to preserve their local structure when constructing the similarity matrix. It then employs connected graph decomposition to partition the resulting similarity matrix and group the data points into clusters. Experimental analysis on 12 UCI data sets showed that the proposed method outperformed state-of-the-art clustering methods in terms of clustering performance.
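The abstract does not say how the decomposition step is implemented, so the following is only a minimal sketch of the general idea: once a similarity matrix is available, the points can be grouped by the connected components of the graph it induces. The function name `connected_components`, the `threshold` parameter, and the toy matrix `S` are assumptions made for illustration and are not taken from the paper; the centroid selection and local PCA steps are not reproduced here.

```python
from collections import deque


def connected_components(similarity, threshold=0.0):
    """Label each point by the connected component it belongs to.

    Hypothetical helper: an edge i-j is assumed to exist when the
    similarity in either direction exceeds `threshold`.
    """
    n = len(similarity)
    labels = [-1] * n  # -1 marks a point that has not been visited yet
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        # Breadth-first search from an unvisited point collects one component.
        labels[start] = current
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in range(n):
                linked = similarity[i][j] > threshold or similarity[j][i] > threshold
                if labels[j] == -1 and linked:
                    labels[j] = current
                    queue.append(j)
        current += 1
    return labels


# Toy similarity matrix (illustrative values only): two disconnected blocks.
S = [
    [1.0, 0.8, 0.0, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.6, 0.0],
    [0.0, 0.0, 0.6, 1.0, 0.7],
    [0.0, 0.0, 0.0, 0.7, 1.0],
]
labels = connected_components(S)
print(labels)  # [0, 0, 1, 1, 1]
```

In this sketch the number of clusters is not supplied in advance; it falls out of the number of components, so the result depends entirely on how sparse the similarity matrix is.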


This work is partially supported by the China Key Research Program (Grant No: 2016YFB1000905); the Natural Science Foundation of China (Grants No: 61876046, 61573270 and 61672177); the Project of Guangxi Science and Technology (GuiKeAD17195062); the Guangxi Natural Science Foundation (Grant No: 2015GXNSFCB139011); the Guangxi Collaborative Innovation Center of Multi-Source Information Integration and Intelligent Processing; the Guangxi High Institutions Program of Introducing 100 High-Level Overseas Talents; and the Research Fund of Guangxi Key Lab of Multisource Information Mining & Security (18-A-01-01).
Cite this article
Tong, T., Zhu, X. & Du, T. Connected graph decomposition for spectral clustering. Multimed Tools Appl 78, 33247–33259 (2019). https://doi.org/10.1007/s11042-018-6643-8
DOI: https://doi.org/10.1007/s11042-018-6643-8