Abstract
With the rapid development of information technologies, more and more data are collected from multiple sources, each offering a different perspective (view) on the same objects. To exploit the information shared among multiple views, K-means based multi-view clustering methods have been designed and are widely used in various applications for their simplicity and efficiency. However, these methods either cluster data in the original high-dimensional feature space, which is extremely time-consuming and sensitive to outliers, or cluster data in an embedded feature space for each view, where the optimal reduced dimensionality is hard to determine. To solve these problems, we propose a robust discriminative multi-view K-means clustering algorithm with feature selection and group sparsity learning. Compared to state-of-the-art methods, the proposed algorithm has two advantages: 1) Discriminative K-means clustering and feature learning are integrated into a single framework, where robust and accurate clustering results are obtained in the embedded feature space with an l2,1-norm based loss function. 2) Group sparsity constraints are imposed to select the most relevant features and the most important views. We apply the proposed algorithm to several kinds of multimedia understanding applications. Experimental results demonstrate the effectiveness of the proposed algorithm.
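For readers unfamiliar with the l2,1-norm mentioned in the abstract, the following is a minimal sketch of its standard definition (the sum of the Euclidean norms of a matrix's rows) and of how row norms are commonly used to rank features or views. It is not the paper's objective function or optimization procedure, which the abstract does not give. The function names, the row-per-sample and row-per-feature layout, and the toy matrices are all assumptions made for illustration.

```python
import numpy as np

def l21_norm(M):
    """l2,1-norm: sum of the Euclidean norms of the rows of M."""
    return float(np.sum(np.linalg.norm(M, axis=1)))

def rank_features(W):
    """Indices of the rows of W, ordered from largest to smallest row norm.

    Assumption: each row of W holds the projection weights of one feature,
    so a near-zero row marks a feature that contributes little.
    """
    return np.argsort(-np.linalg.norm(W, axis=1))

def view_scores(W_blocks):
    """Frobenius norm of each view's weight block (hypothetical view score)."""
    return np.array([np.linalg.norm(W) for W in W_blocks])

# Toy residual matrix: one row per sample, the last row is an outlier.
R = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [30.0, 40.0]])

print(l21_norm(R))     # outlier contributes 50 to the total
print(np.sum(R ** 2))  # squared loss: outlier contributes 2500

# Toy projection matrix with three features and two embedded dimensions.
W = np.array([[0.1, 0.0],
              [2.0, 1.0],
              [0.0, 0.5]])
print(rank_features(W))
```

Because each sample's residual enters the l2,1-norm unsquared, a single outlier inflates the loss far less than under a squared loss, which is the usual argument for its robustness. Penalizing row norms, or the norms of whole per-view blocks, tends to drive entire rows or blocks toward zero, which is how group sparsity can act as feature and view selection.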
About this article
Cite this article
Zeng, Z., Wang, X., Yan, F. et al. Robust Discriminative multi-view K-means clustering with feature selection and group sparsity learning. Multimed Tools Appl 77, 22433–22453 (2018). https://doi.org/10.1007/s11042-018-6033-2
DOI: https://doi.org/10.1007/s11042-018-6033-2