Incomplete multiview nonnegative representation learning with multiple graphs
Introduction
Recently, multiview clustering has become an important problem in machine learning and artificial intelligence [1]. In multiview clustering, each instance is associated with multiple features from diverse views, which often contain complementary information, and the objective is to model the complex correlation among these views [2], [3]. Most multiview clustering methods learn either a unified representation of the multiview data [4], [5], [6] or a common graph structure among instances [7], [8] for data grouping. For example, the multiview spectral clustering via integrating nonnegative embedding and spectral embedding (NESE) [4] method performed matrix factorization on multiple graphs and obtained a consistent nonnegative embedding for clustering. Multiview learning with adaptive neighbors (MLAN) [7] learned a consistent similarity matrix with connected components that is shared by all views. However, traditional multiview clustering methods assume that all views of every instance are complete [9], [10]. In many real-world multiview tasks, instances are missing some of their views [11], [12], which makes it difficult to model the correlation among instances.
This view-missing problem in multiview clustering is commonly referred to as the incomplete multiview clustering (IMC) problem, and several efforts have been made in recent years to handle it [13], [14]. Graph-based methods are an important technique for solving the IMC problem. They aim to learn a consensus embedding while preserving the graph structure information among multiple incomplete views [15], [16]. Trivedi et al. [17] used the Laplacian regularization of a complete view to establish the kernel representation of incomplete views. This was an early work on IMC, with the limitation that at least one view must be complete. Gao et al. [18] filled incomplete views with the mean of the instances and then performed latent consensus representation learning, although the filled values may affect the subsequent multiview learning. Wang et al. [19] proposed a perturbation-oriented IMC method, which obtained the consensus representation from multiple similarity graphs. Incomplete multimodality grouping (IMG) [20] transformed the complete and incomplete instances into a complete representation and then learned a common graph structure, while incomplete multiview spectral clustering with adaptive graph learning (IMSC_AGL) [21] performed subspace learning and consensus representation learning simultaneously. Moreover, both IMG and IMSC_AGL perform well without filling incomplete views.
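The mean-filling strategy mentioned above can be illustrated with a minimal sketch. It only shows the general idea of replacing the features of missing instances in one view with the mean of the observed instances. The function name `mean_fill` and the boolean mask `present` are assumptions made for illustration and are not taken from [18].

```python
import numpy as np

def mean_fill(X, present):
    """Fill the rows of missing instances in one view with the mean of observed rows.

    X       : (n, d) feature matrix of one view; rows of missing instances
              may hold any placeholder value.
    present : (n,) boolean mask, True where the instance is observed in this view.
    Both names are hypothetical and used only for this illustration.
    """
    X = np.asarray(X, dtype=float)
    present = np.asarray(present, dtype=bool)
    filled = X.copy()
    # Every missing row receives the same vector: the mean of the observed rows.
    filled[~present] = X[present].mean(axis=0)
    return filled

# Toy view with three instances; the second one is missing in this view.
X = np.array([[1.0, 2.0], [0.0, 0.0], [3.0, 6.0]])
present = np.array([True, False, True])
X_filled = mean_fill(X, present)
```

Because every missing instance receives the same filled vector, the filled rows carry no instance-specific information, which is one way to see why filling may affect the subsequent multiview learning.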
Matrix factorization is another research hotspot for solving the IMC problem, with methods such as partial multiview clustering (PVC) [22], multiple incomplete view clustering (MIC) [23], online multiview clustering (OMVC) [24], doubly aligned IMC (DAIMC) [25], and one-pass IMC (OPIMC) [26]. DAIMC [25] introduced a regression constraint into weighted semi-nonnegative matrix factorization and utilized the given instance alignment information to learn a common latent feature matrix for all the views. OPIMC [26] achieved efficient and effective IMC by taking the instance missing information into account through regularized matrix factorization and weighted matrix factorization. Matrix factorization-based IMC methods usually introduce weight matrices containing the missing view information, so that they can deal with the IMC problem directly. However, compared with graph-based IMC methods, they have obvious shortcomings in learning the nonlinear structure among instances. Currently, there are some IMC works based on both matrix factorization and graph learning, such as graph regularized partial multi-view clustering (GPMVC) [27] and generalized IMC with flexible locality structure diffusion (GIMC_FLSD) [28]. For example, GIMC_FLSD [28] flexibly performed local structural learning and individual representation learning, where all individual representations can be easily converted to a common representation. Compared with graph-based IMC methods, GIMC_FLSD can make fuller use of the local geometric information among instances by performing matrix factorization on the neighbors of the instances. Besides, GIMC_FLSD adaptively learned the importance of different views, which was usually ignored in graph-based or matrix factorization-based IMC methods. Therefore, this paper focuses on an IMC method that integrates graph information and nonnegative matrix factorization.
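To make the role of the weight matrices concrete, the following is a minimal sketch of weighted nonnegative matrix factorization on a single view. It is not the formulation of DAIMC or OPIMC. It assumes a per-instance weight vector `w` (1 for observed instances, 0 for missing ones) and uses the standard multiplicative updates for the weighted squared error. The names `weighted_nmf` and `weighted_loss` are hypothetical.

```python
import numpy as np

def weighted_loss(X, w, U, V):
    """Weighted squared reconstruction error: sum_i w_i * ||x_i - u_i V||^2."""
    return float(np.sum(w[:, None] * (X - U @ V) ** 2))

def weighted_nmf(X, w, k, n_iter=200, eps=1e-10, seed=0):
    """Factorize a nonnegative (n, d) view X as U @ V with instance weights w.

    Assumption for illustration: w[i] = 0 marks an instance missing in this view,
    so its row of X never contributes to the loss or to the updates of V.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    M = w[:, None] * np.ones_like(X)      # element-wise weight matrix
    U = rng.random((n, k)) + 0.1          # nonnegative initialization
    V = rng.random((k, d)) + 0.1
    for _ in range(n_iter):
        # Multiplicative updates keep U and V nonnegative.
        U *= ((M * X) @ V.T) / ((M * (U @ V)) @ V.T + eps)
        V *= (U.T @ (M * X)) / (U.T @ (M * (U @ V)) + eps)
    return U, V

# Toy view with six instances; the third one is missing (weight 0).
rng = np.random.default_rng(1)
X = rng.random((6, 4))
w = np.array([1.0, 1.0, 0.0, 1.0, 1.0, 1.0])
U1, V1 = weighted_nmf(X, w, k=2, n_iter=1)
U, V = weighted_nmf(X, w, k=2, n_iter=200)
```

In this single-view sketch the representation of the missing instance collapses to zero, since nothing constrains it. Multiview methods avoid this by tying the representations of the same instance across views, so that the views in which the instance is observed determine its common representation.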
In this paper, we develop a novel incomplete multiview nonnegative representation learning (IMNRL) framework for IMC, which inherits the advantages of both graph-based and matrix factorization-based IMC methods. As shown in Fig. 1, IMNRL takes advantage of the neighbor structure of each individual incomplete view to construct multiple similarity graphs and decomposes these graphs into a consensus nonnegative embedding and view-specific graph embeddings. In this way, the consensus nonnegative embedding can contain nonlinear structural information from different views. Moreover, we employ an additional graph regularization term to constrain the consensus embedding, so that the learned consensus embedding can retain more graph structure information. In IMNRL, the final cluster labels are determined by the column index of the largest value in each row of the consensus embedding. To summarize, this paper's main contributions are:
- We build a novel incomplete multiview nonnegative representation learning framework, namely IMNRL. It can handle various incomplete cases.
- IMNRL performs the consensus nonnegative representation learning and the view-specific representation learning simultaneously. The consensus nonnegative embedding retains local structural information on different incomplete views, and it directly reveals the clustering results.
- We perform experiments to verify the proposed IMNRL, and the results on different incomplete scenarios demonstrate that IMNRL achieves state-of-the-art incomplete multiview clustering results.
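Two of the steps described above, building a similarity graph from the neighbor structure of an incomplete view and reading cluster labels from the consensus embedding, can be sketched as follows. This is a minimal illustration under stated assumptions, not the IMNRL implementation. The binary k-nearest-neighbor graph, the zero rows for missing instances, and the names `knn_graph_incomplete`, `labels_from_embedding`, and `present` are all choices made here for illustration, and the consensus embedding `H` is a made-up example rather than a learned one.

```python
import numpy as np

def knn_graph_incomplete(X, present, k=2):
    """Symmetric 0/1 kNN graph over the observed instances of one view.

    Instances missing in this view (present[i] == False) keep all-zero rows
    and columns, so the graph has the full (n, n) size for every view.
    """
    n = X.shape[0]
    idx = np.flatnonzero(present)               # indices of observed instances
    Xo = X[idx]
    # Pairwise squared Euclidean distances among observed instances.
    D = ((Xo[:, None, :] - Xo[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(D, np.inf)                 # exclude self-neighbors
    S = np.zeros((n, n))
    kk = max(0, min(k, len(idx) - 1))
    for a in range(len(idx)):
        nbrs = np.argsort(D[a])[:kk]            # kk nearest observed neighbors
        S[idx[a], idx[nbrs]] = 1.0
    return np.maximum(S, S.T)                   # symmetrize

def labels_from_embedding(H):
    """Cluster label = column index of the largest value in each row of H."""
    return np.argmax(H, axis=1)

# Toy view: two tight pairs of points, and a fifth instance missing in this view.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [9.0, 9.0]])
present = np.array([True, True, True, True, False])
S = knn_graph_incomplete(X, present, k=1)

# Hypothetical nonnegative consensus embedding with two clusters.
H = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.7], [0.2, 0.6], [0.5, 0.4]])
labels = labels_from_embedding(H)
```

Keeping every graph at size n by n, with empty rows for missing instances, is one simple way to let graphs from different incomplete views share a single consensus embedding with one row per instance.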
The remainder of the paper is organized as follows. Section 2 introduces the related work. Section 3 explains the proposed IMNRL model, and the optimization of IMNRL is given in Section 4. In Section 5, experimental results show the feasibility of the proposed method. Finally, conclusions are given in the last section.
Related work
This section briefly reviews related work, including traditional single-view clustering methods and multiview clustering methods.
The proposed method
This section mainly introduces the IMNRL framework. IMNRL learns the consensus nonnegative embedding and the view-specific graph embeddings simultaneously.
Optimization
In this section, we first provide the inference and learning procedures of IMNRL, and then give the convergence analysis and the computational complexity.
Experiments
In this section, we compare the proposed IMNRL with state-of-the-art IMC methods on incomplete real-world multiview datasets.
Conclusion
In this paper, we have presented an effective incomplete multiview nonnegative representation learning (IMNRL) framework, which can handle the incomplete multiview clustering problem well without filling incomplete views. IMNRL uses the nonnegative term and the graph regularization term to constrain the consensus representation, and thus the consensus representation can retain local structural information on multiple incomplete views and final data partition information. Besides, the final
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Grant nos. 62076096 and 62006076), the Shanghai Municipal Project (No. 20511100900), Shanghai Knowledge Service Platform Project ZF1213, the Open Research Fund of KLATASDS-MOE, and the Fundamental Research Funds for the Central Universities.
References (47)
- et al. Multi-view subspace clustering via simultaneously learning the representation tensor and affinity matrix. Pattern Recognit. (2020)
- et al. Discriminative subspace matrix factorization for multiview data clustering. Pattern Recognit. (2021)
- et al. Pseudo-label guided collective matrix factorization for multiview clustering. IEEE Trans. Cybern. (2021)
- et al. Bayesian co-training. J. Mach. Learn. Res. (2011)
- et al. Partial multi-view outlier detection based on collective learning. Proceedings of AAAI Conference on Artificial Intelligence (2018)
- et al. Structured general and specific multi-view subspace clustering. Pattern Recognit. (2019)
- et al. A matrix factorization approach for integrating multiple data views. Proceedings of Joint European Conference on Machine Learning and Knowledge Discovery in Databases (2009)
- et al. Multiview Machine Learning (2019)
- et al. Multiview graph restricted Boltzmann machines. IEEE Trans. Cybern. (2021)
- et al. Multi-view spectral clustering via integrating nonnegative embedding and spectral embedding. Inf. Fusion (2020)
- Multi-view clustering and semi-supervised classification with adaptive neighbours. Proceedings of AAAI Conference on Artificial Intelligence, San Francisco, California, USA
- Dual distance adaptive multiview clustering. Neurocomputing
- Multi-view learning overview: recent progress and new challenges. Inf. Fusion
- Multiple kernel k-means with incomplete kernels. IEEE Trans. Pattern Anal. Mach. Intell.
- Efficient and effective regularized incomplete multi-view clustering. IEEE Trans. Pattern Anal. Mach. Intell.
- Adaptive graph completion based incomplete multi-view clustering. IEEE Trans. Multimed.
- Incomplete multi-view clustering via structured graph learning. Proceedings of Pacific Rim International Conference on Artificial Intelligence
- Partial multiview clustering with locality graph regularization. Int. J. Intell. Syst.
- Multiview clustering with incomplete views. Proceedings of Advances in Neural Information Processing Systems Workshop
- Incomplete multi-view clustering. Proceedings of International Joint Conference on Intelligent Information Processing
- Spectral perturbation meets incomplete multi-view data. Proceedings of International Joint Conference on Artificial Intelligence
- Incomplete multi-modal visual data grouping. Proceedings of International Joint Conference on Artificial Intelligence, New York, NY, USA
- Incomplete multiview spectral clustering with adaptive graph learning. IEEE Trans. Cybern.
Nan Zhang received the Ph.D. degree in artificial intelligence and pattern recognition from China University of Mining and Technology in 2019. He is now a postdoctoral fellow with the School of Computer Science and Technology and the Head of the Pattern Recognition and Machine Learning Research Group, East China Normal University. His research results have been published in over 20 papers in peer-reviewed journals. His current research interests include multiview learning and deep learning.
Shiliang Sun received the Ph.D. degree in pattern recognition and intelligent systems from Tsinghua University, Beijing, China, in 2007. He is a Professor with the School of Computer Science and Technology and the Head of the Pattern Recognition and Machine Learning Research Group, East China Normal University, Shanghai, China. From 2009 to 2010, he was a Visiting Researcher with the School of Computer Science, Centre for Computational Statistics and Machine Learning, University College London, London, U.K. In 2014, he was a Visiting Researcher with the Department of Electrical Engineering, Columbia University, New York, NY, USA. His current research interests include kernel methods, multiview learning, learning theory, approximate inference, sequential modeling, deep learning and their applications. His research results have been published in over 100 papers in peer-reviewed journals and conferences, such as IEEE T-PAMI, JMLR, IEEE T-NNLS, IEEE T-Cybernetics, PR, NIPS, ICML, IJCAI and ECML. Prof. Sun is on the Editorial Board of multiple international journals, including Pattern Recognition and IEEE Transactions on Neural Networks and Learning Systems.