Multi-view spectral clustering via integrating nonnegative embedding and spectral embedding
Introduction
Clustering is a fundamental problem that arises in many fields, including data mining, computer vision, and machine learning. Conceptually, given n samples, the goal of clustering is to partition them into k subsets. In general, each sample can be described from different views, and leveraging the information of multiple views simultaneously is beneficial for achieving a better clustering result. This problem is referred to as multi-view clustering. Most existing approaches to it can be roughly divided into four groups: graph-based methods [1], [2], [3], [4], [5], [6], matrix factorization methods [7], [8], [9], [10], [11], multiple kernel-based methods [12], [13], [14], and subspace learning-based methods [15], [16]. In particular, owing to their ability to capture non-linear structure information, graph-based methods generally outperform the others in terms of clustering precision.
Graph-based methods mainly differ in how they learn the consistent spectral embedding. One simple alternative is multi-view spectral clustering, which directly uses multiple graphs, constructed by k-nearest neighbors (or other similar approaches), to learn a consistent spectral embedding [1], [2], [17]. However, the noise levels of different views generally differ, which implies that the resulting graphs differ in quality. To alleviate this issue, multi-view subspace clustering [18], [19], [20] first learns a consistent similarity matrix, i.e., a graph, from multiple views, and then obtains the final spectral embedding via traditional single-view spectral embedding methods, such as Ncut.
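To make the graph-based pipeline concrete, the following is a minimal sketch of building one k-nearest-neighbor graph per view and extracting a shared spectral embedding. It is not the procedure of any cited method: averaging the normalized affinities across views is a deliberately simple stand-in for the co-training and co-regularization schemes of [1], [2], and the function names, the 0/1 edge weights, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def knn_graph(X, n_neighbors):
    """Symmetric 0/1 k-nearest-neighbor affinity matrix (illustrative choice)."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared Euclidean distances
    np.fill_diagonal(dist, np.inf)                      # a sample is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :n_neighbors]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), n_neighbors), nearest.ravel()] = 1.0
    return np.maximum(W, W.T)                           # symmetrize

def normalized_affinity(W):
    """D^{-1/2} W D^{-1/2}, where D is the diagonal degree matrix."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    return d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

def consistent_embedding(views, n_neighbors, n_clusters):
    """Top eigenvectors of the view-averaged normalized affinity (assumed fusion rule)."""
    S = sum(normalized_affinity(knn_graph(X, n_neighbors)) for X in views) / len(views)
    _, vecs = np.linalg.eigh(S)       # eigenvalues are returned in ascending order
    return vecs[:, -n_clusters:]      # eigenvectors of the largest eigenvalues

rng = np.random.default_rng(0)
views = [rng.normal(size=(20, 3)), rng.normal(size=(20, 5))]  # two synthetic views
F = consistent_embedding(views, n_neighbors=4, n_clusters=2)
```

Note that the embedding F still has to be discretized, for example by K-means, and that the dense eigendecomposition dominates the cost; both points are taken up in the discussion of deficiencies that follows.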
Graph-based methods usually suffer from three deficiencies. The first is the need for post-processing. The final clustering result is generally obtained by applying K-means or spectral rotation to the consistent spectral embedding, which inevitably introduces uncertainty caused by initialization. The second is sensitivity to parameter selection. The models of most existing graph-based methods introduce additional parameters, yet parameter selection is not easy for clustering, which is an unsupervised task. The third is high computational cost. Eigenvalue decomposition is required by most multi-view spectral clustering methods, and matrix inversion is required by most multi-view subspace clustering methods when solving the involved models. Both eigenvalue decomposition and matrix inversion are time-consuming for large-scale data.
Another line of studies focuses on matrix factorization methods [7], [9], [21], [22], which generally provide a large advantage over graph-based methods in terms of time cost. However, these methods cannot handle data with non-linear structure. In short, graph-based methods perform better but are limited by their high computational cost, whereas matrix factorization methods are efficient but unable to provide a satisfactory clustering result. This trade-off motivates us to combine the advantages of both. In this work, we propose to perform spectral embedding and nonnegative embedding simultaneously. Our basic idea is partially motivated by Kuang et al. [23], where the relation between symmetric nonnegative matrix factorization and spectral clustering is discussed in the single-view case. The main contributions of this work can be summarized as follows:
- We provide a novel multi-view spectral clustering algorithm, namely NESE (multi-view spectral clustering via integrating Nonnegative Embedding and Spectral Embedding). It inherits the advantages of both graph-based and matrix factorization methods. Specifically, the model of NESE is parameter-free, which makes it more applicable than existing methods. Moreover, the solution returned by NESE directly reveals the consistent clustering result, so that the uncertainty brought by post-processing, such as K-means and spectral rotation, can be avoided.
- We provide an efficient optimization approach, namely inexact Majorization-Minimization (inexact-MM), to solve the non-convex and non-smooth objective involved in NESE. The computational complexity of inexact-MM is approximately where n and k are the number of samples and clusters, respectively.
- We conduct numerous experiments to verify the performance of NESE. The experimental results demonstrate that our method achieves comparable or even better clustering results. We provide the datasets and code at https://github.com/sudalvxin/SMSC.git.
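As a small illustration of what it can mean for a nonnegative embedding to reveal the clustering result directly, the sketch below reads a label off each row of a nonnegative matrix by taking the index of its largest entry. This readout rule is an assumption borrowed from common practice in nonnegative matrix factorization; the text above does not spell out the rule NESE uses, and the matrix H and the function name are hypothetical.

```python
import numpy as np

def labels_from_nonnegative_embedding(H):
    """Assign each sample (row) to the column with the largest entry.

    Assumed readout rule; no K-means or spectral rotation is involved,
    so the result does not depend on any random initialization.
    """
    H = np.asarray(H, dtype=float)
    if (H < 0).any():
        raise ValueError("H must be nonnegative")
    return np.argmax(H, axis=1)

# Hypothetical embedding of four samples into three clusters.
H = np.array([[0.9, 0.0, 0.1],
              [0.8, 0.1, 0.0],
              [0.0, 0.7, 0.2],
              [0.1, 0.0, 0.6]])
labels = labels_from_nonnegative_embedding(H)
```

Because the readout is deterministic, repeated runs on the same embedding always return the same partition, in contrast to K-means applied to a spectral embedding.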
We report the notations that are widely used in this paper in Table 1. The remainder of this work is organized as follows. We introduce the related works in Section 2, and present the proposed method NESE in Section 3. The optimization details of NESE are summarized in Section 4. We report comparison results in Section 5, and conclude this work in Section 6.
Related work
In this section, we first review some representative methods for multi-view clustering, and then introduce the studies that are related to our method.
Proposed method
In this section, we first provide a model for single-view spectral clustering, and then generalize it to multi-view setting.
Optimization of proposed method
In this section, we focus on solving the objective of NESE. Note that directly solving problem (12) is challenging, because it is non-smooth and non-convex. Following [30], we adopt an inexact Majorization-Minimization (MM) method [43]. Before continuing, we provide a brief introduction to MM, which has been ignored by previous studies [4], [25], [30].
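For readers unfamiliar with MM, the following toy sketch may help. It is not the NESE objective of problem (12) and it is not the inexact variant: it minimizes the non-smooth function f(x) = sum_i |x - a_i| by repeatedly building a quadratic surrogate that lies above f and touches it at the current iterate, then minimizing that surrogate exactly. The surrogate relies on the inequality |r| <= r^2 / (2|r_t|) + |r_t| / 2 for r_t != 0; the function name, the data, and the safeguard eps are assumptions made for illustration.

```python
import numpy as np

def mm_minimize_abs_sum(a, x0, n_iter=100, eps=1e-12, tol=1e-10):
    """Minimize f(x) = sum_i |x - a_i| with a quadratic majorizer (toy MM example)."""
    a = np.asarray(a, dtype=float)
    x = float(x0)
    history = [np.abs(x - a).sum()]             # objective value at each iterate
    for _ in range(n_iter):
        # Majorization: weights of the quadratic surrogate built at the current x.
        w = 1.0 / np.maximum(np.abs(x - a), eps)
        # Minimization: the surrogate sum_i w_i (x - a_i)^2 / 2 has a closed-form minimizer.
        x_new = float(np.sum(w * a) / np.sum(w))
        history.append(np.abs(x_new - a).sum())
        converged = abs(x_new - x) < tol
        x = x_new
        if converged:
            break
    return x, history

a = [1.0, 2.0, 3.0, 10.0, 50.0]
x_star, history = mm_minimize_abs_sum(a, x0=np.mean(a))
```

Each step cannot increase the objective, which is the defining property of MM, and the iterates here approach the median of the data. An inexact scheme relaxes the minimization step so that the surrogate only needs to be decreased rather than minimized exactly.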
Experiments
To verify the performance of the proposed method NESE, we compare it with a large number of graph-based multi-view clustering methods, including CotSC [1], CorSC [2], MLAN [4], SwMC [25], AASC [24], MVGL [27], AMGL [3] and AWP [30]. For all algorithms that require graph similarity matrices as input, we use the method proposed in [29] to construct the similarity matrix for each view. The reason is that it can avoid the scale difference between different views and generate a
Conclusion
This work provided a novel method, namely NESE, for multi-view spectral clustering. The core idea of NESE is to learn a consistent nonnegative embedding and multiple spectral embeddings simultaneously. In particular, the nonnegative embedding directly reveals the desired consistent clustering result. Furthermore, an inexact-MM method is developed to solve the involved objective. Numerous experimental results demonstrate the promising empirical performance of NESE. For the subproblem solved
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China under Grant 61772427, Grant 61751202 and Grant 61936014, in part by the National Key Research and Development Program of China under Grant 2018YFB1403500, and in part by the Fundamental Research Funds for the Central Universities under Grant G2019KY0501.
References (47)
[1] et al., A co-training approach for multi-view spectral clustering, Proceedings of the International Conference on Machine Learning (2011)
[2] et al., Co-regularized multi-view spectral clustering, Proceedings of the Advances in Neural Information Processing Systems (2011)
[3] et al., Parameter-free auto-weighted multiple graph learning: a framework for multiview clustering and semi-supervised classification, Proceedings of the International Joint Conference on Artificial Intelligence (2016)
[4] et al., Multi-view clustering and semi-supervised classification with adaptive neighbours, Proceedings of the AAAI Conference on Artificial Intelligence (2017)
[5] et al., Generalized latent multi-view subspace clustering, IEEE Trans. Pattern Anal. Mach. Intell. (2018)
[6] et al., Parameter-free weighted multi-view projected clustering with structured graph learning, IEEE Trans. Knowl. Data Eng. (2019)
[7] et al., Multi-view clustering via joint nonnegative matrix factorization, Proceedings of the SIAM International Conference on Data Mining (2013)
[8] et al., A matrix factorization approach for integrating multiple data views, Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases (2009)
[9] et al., Multi-view k-means clustering on big data, Proceedings of the International Joint Conference on Artificial Intelligence (2013)
[10] et al., Multi-view clustering via deep matrix factorization, Proceedings of the AAAI Conference on Artificial Intelligence (2017)
[11] Feature extraction via multi-view non-negative matrix factorization with local graph regularization, Proceedings of the IEEE International Conference on Image Processing
[12] Multiple kernel clustering, Proceedings of the SIAM International Conference on Data Mining
[13] Robust multiple kernel k-means using l21-norm, Proceedings of the International Joint Conference on Artificial Intelligence
[14] Localized data fusion for kernel k-means clustering with application to cancer biology, Proceedings of the Advances in Neural Information Processing Systems
[15] Multi-view clustering via canonical correlation analysis, Proceedings of the Annual International Conference on Machine Learning
[16] Convex subspace representation learning from multi-view data, Proceedings of the AAAI Conference on Artificial Intelligence
[17] Multiview spectral embedding, IEEE Trans. Syst. Man Cybern. Part B (Cybernetics)
[18] Low-rank tensor constrained multiview subspace clustering, Proceedings of the IEEE International Conference on Computer Vision
[19] Diversity-induced multi-view subspace clustering, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
[20] Multi-view subspace clustering, Proceedings of the IEEE International Conference on Computer Vision
[21] Re-weighted discriminatively embedded k-means for multi-view clustering, IEEE Trans. Image Process.
[22] Non-negative matrix factorization in multimodality data for segmentation and label prediction, Proceedings of the Computer Vision Winter Workshop
[23] Symmetric nonnegative matrix factorization for graph clustering, Proceedings of the SIAM International Conference on Data Mining