Abstract
The key step in spectral clustering is learning an affinity matrix that measures the similarity among data points. This paper proposes a new spectral clustering method that uses mutual k-nearest neighbors (mutual kNN) to construct the affinity matrix, which reduces the influence of noise. The features of the high-dimensional data are then self-represented, using the affinity matrix in a standardization step, so that important local information in the data is preserved. We further apply normalization to improve clustering performance. Experiments on eight benchmark data sets show that the proposed method outperforms state-of-the-art clustering methods in terms of clustering accuracy and normalized mutual information.
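To make the mutual kNN idea concrete, the following is a minimal sketch and not the authors' implementation. It keeps an edge between two points only when each is among the other's k nearest neighbors, then applies a symmetric normalization to the resulting affinity matrix. The Gaussian edge weight, the `sigma` parameter, the function names, and the toy data are all assumptions introduced for illustration; the abstract does not specify them. The self-representation step and the eigendecomposition are not covered here.

```python
import math


def mutual_knn_affinity(points, k, sigma=1.0):
    """Build a symmetric affinity matrix from a mutual kNN graph.

    Edge (i, j) is kept only if j is among the k nearest neighbors of i
    AND i is among the k nearest neighbors of j. The Gaussian weight and
    sigma are illustrative assumptions, not taken from the paper.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # k nearest neighbors of each point, excluding the point itself.
    knn = []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        knn.append(set(order[:k]))

    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in knn[i]:
            if i in knn[j]:  # the mutual condition
                W[i][j] = math.exp(-dist[i][j] ** 2 / (2 * sigma ** 2))
    return W


def normalize_affinity(W):
    """Symmetric normalization D^{-1/2} W D^{-1/2}.

    Rows of isolated points (degree 0) are left as zeros.
    """
    n = len(W)
    deg = [sum(row) for row in W]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    return [[inv_sqrt[i] * W[i][j] * inv_sqrt[j] for j in range(n)] for i in range(n)]


# Hypothetical toy data: two tight groups plus one outlier (the last point).
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
    (2.5, 9.0),
]
W = mutual_knn_affinity(points, k=2)
S = normalize_affinity(W)
```

In this toy example the outlier picks neighbors from the second group, but no point in that group picks it back, so its row in `W` stays all zeros. This is how the mutual condition can suppress noisy points. A full pipeline would then cluster the leading eigenvectors of `S`, for example with k-means.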
Acknowledgments
This work is partially supported by the China Key Research Program (Grant No. 2016YFB1000905); the Key Program of the National Natural Science Foundation of China (Grant No. 61836016); the Natural Science Foundation of China (Grant Nos. 61876046, 61573270, 81701780 and 61672177); the Project of Guangxi Science and Technology (GuiKeAD17195062); the Guangxi Natural Science Foundation (Grant Nos. 2015GXNSFCB139011 and 2017GXNSFBA198221); the Guangxi Collaborative Innovation Center of Multi-Source Information Integration and Intelligent Processing; the Guangxi High Institutions Program of Introducing 100 High-Level Overseas Talents; and the Research Fund of Guangxi Key Lab of Multisource Information Mining and Security (18-A-01-01).
Ethics declarations
Conflict of interest
We wish to draw the attention of the Editor to the following facts which may be considered as potential conflicts of interest and to significant financial contributions to this work. We confirm that the manuscript has been read and approved by all named authors and that there are no other persons who satisfied the criteria for authorship, but are not listed. We further confirm that the order of authors listed in the manuscript has been approved by all of us. We understand that the Corresponding Author is the sole contact for the Editorial process (including Editorial Manager and direct communications with the office). He/she is responsible for communicating with the other authors about progress, submissions of revisions, and the final approval of proofs.
About this article
Cite this article
Tan, M., Zhang, S. & Wu, L. Mutual kNN based spectral clustering. Neural Comput & Applic 32, 6435–6442 (2020). https://doi.org/10.1007/s00521-018-3836-z
DOI: https://doi.org/10.1007/s00521-018-3836-z