
Neurocomputing

Volume 443, 5 July 2021, Pages 106-116

Incomplete multi-view learning via half-quadratic minimization

https://doi.org/10.1016/j.neucom.2021.02.043

Abstract

To deal with the incomplete multi-view data that arise in real applications, incomplete multi-view learning has developed rapidly in recent years. Among the various incomplete multi-view learning methods, a considerable number are built on the matrix factorization technique. Most existing matrix factorization based methods directly adopt the sum of squared ℓ2-norms as the loss function, which is known to be sensitive to missing values. To overcome this issue, we propose a new matrix factorization method, named Incomplete Multi-view Learning via Half-quadratic Minimization (IMLHM). Different from previous methods, a robust estimator based on half-quadratic minimization theory is incorporated into our loss function to overcome the sensitivity of the ℓ2-norm to noise. The influence of badly recovered instances is decreased via the automatic weighting scheme derived from the half-quadratic minimization process, thereby improving the robustness of the proposed method. Additionally, a nuclear norm term is introduced to exploit the low-rank structure of the learned representation matrix, further improving the robustness of the proposed method against noise. An alternating iterative algorithm is developed to optimize the objective function. Comprehensive experimental results on seven data sets verify the effectiveness of the proposed method.
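The nuclear norm term described above is typically handled through its proximal operator, known as singular value thresholding. The sketch below is a standard building block, not the authors' code; the function name and interface are ours:

```python
import numpy as np

def svt(Z, tau):
    """Singular value thresholding: the proximal operator of tau * ||.||_*,
    i.e. argmin_P  0.5 * ||P - Z||_F^2 + tau * ||P||_*.
    Shrinking singular values toward zero encourages a low-rank P."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt
```

Singular values below tau are zeroed, so the result has reduced rank; this is the update a low-rank representation subproblem would use inside an alternating scheme.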

Introduction

In recent years, multi-view data have been produced in various applications. For example, images can be described by Histogram of Oriented Gradients (HOG) [1], Local Binary Patterns (LBP) [2] and SIFT [3], and a film segment can be represented by voice and video features. Research on multi-view learning has attracted numerous scholars' attention [4], [5], [6], [7], [8], [9], [43]. Most traditional multi-view studies assume that the data on each view are complete. However, in real applications, the integrity of multi-view data usually cannot be guaranteed [10], [11], [12], [44]. Practically, human error and mechanical failure occur from time to time in the data collection process, which leads to incomplete multi-view data. Many traditional multi-view methods simply discard incomplete samples, which undoubtedly reduces the number of available samples [13], [14], [15], [42]. To make use of the information contained in the incomplete samples, a straightforward way is to fill the missing values with 0 or with the mean of the surviving values. Nevertheless, the performance of this filling strategy is still limited. To make better use of incomplete multi-view samples, various incomplete multi-view learning methods have been proposed. They mainly fall into three categories: (1) Matrix factorization based methods with different constraints or regularization terms. These methods attempt to learn a representation matrix containing information as complete as possible, and can easily be extended to handle the case of both view missing and variable missing. (2) Kernel-based methods, which import kernel tricks into incomplete multi-view learning [16]. For instance, the within-view and between-view relationships are combined to predict the missing rows and columns of kernel matrices [17]. (3) Graph-based methods [18], [19]. When constructing the graph on each view, the connecting weight involving missing samples is set to 0. Then, based on the graphs of the multiple views, a common representation matrix is learned. For example, Wen et al. [19] combined the self-representation model and spectral clustering to learn a common relaxed clustering indicator matrix after graph construction. Kernel-based and graph-based methods achieve promising performance, but most of them are not applicable when both views and variables are missing. To handle the various missing patterns found in real applications, we focus on matrix factorization based methods.
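The zero- and mean-filling baseline mentioned above can be sketched in a few lines; the helper below is illustrative, not taken from any cited method:

```python
import numpy as np

def fill_missing(X, mask, strategy="mean"):
    """Fill missing entries of one view's data matrix.

    X    : (n_samples, n_features) array; missing entries hold arbitrary values.
    mask : boolean array of the same shape; True marks an observed entry.
    """
    X_filled = np.where(mask, X, 0.0)  # zero-filling baseline
    if strategy == "mean":
        # column mean computed over the observed (surviving) entries only
        counts = mask.sum(axis=0)
        col_mean = np.divide(X_filled.sum(axis=0), counts,
                             out=np.zeros(X.shape[1]), where=counts > 0)
        X_filled = np.where(mask, X, col_mean)
    return X_filled
```

Mean filling uses only the observed entries of each feature, which is one reason its performance degrades as the missing rate grows.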

Based on the assumption that different views are generated from a common subspace, a number of popular incomplete multi-view learning methods have been designed using the matrix factorization technique. Concretely, based on Nonnegative Matrix Factorization (NMF) and ℓ1 sparse regularization, Li et al. [20] proposed the Partial multi-View Clustering (PVC) method. PVC is a pioneering work that learns a full representation for incomplete two-view data. Then, Xu et al. [21] proposed the Multi-View Learning with Incomplete Views (MVL-IV) method, which factorizes the incomplete data matrix of each view as the product of a view-specific basis matrix and a common representation matrix. Almost simultaneously, Shao et al. [22] put forward the Multiple Incomplete views Clustering (MIC) method, based on NMF and ℓ2,1 regularization. Concretely, MIC fills the missing views with the mean of the available instances and gives smaller weights to the incomplete instances. Since geometric information was not fully exploited by previous algorithms, Zhao et al. proposed the Incomplete Multi-modality Grouping (IMG) [23] approach to explore the compact global structure of the data when learning the latent subspace. Specifically, an automatically learned graph Laplacian term is imposed on the latent representation to force similar samples to be close in the latent subspace. Utilizing weighted semi-NMF, Hu et al. developed the Doubly Aligned Incomplete Multi-view Clustering (DAIMC) algorithm to align samples and basis matrices simultaneously [24]. Although the above mentioned methods have greatly promoted the development of incomplete multi-view learning, their performance can be further improved. It can be observed that all of them employ the squared ℓ2-norm as the loss function.
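As a rough illustration of the shared-representation idea behind methods such as MVL-IV, the sketch below fits a common representation H and view-specific basis matrices U_v to the observed entries only, using masked gradient descent. The names and the optimization scheme are our own simplification, not any of the cited implementations:

```python
import numpy as np

def mvl_factorize(Xs, masks, k, n_iter=500, lr=0.02, lam=1e-3, seed=0):
    """Fit X_v ~= H @ U_v over observed entries only.

    Xs    : list of (n, d_v) view matrices (rows are samples)
    masks : list of boolean (n, d_v) arrays, True = observed entry
    k     : dimension of the common representation H (n, k)
    """
    rng = np.random.default_rng(seed)
    n = Xs[0].shape[0]
    H = 0.1 * rng.standard_normal((n, k))
    Us = [0.1 * rng.standard_normal((k, X.shape[1])) for X in Xs]
    for _ in range(n_iter):
        grad_H = lam * H
        for X, M, U in zip(Xs, masks, Us):
            R = M * (H @ U - X)            # residual, masked to observed entries
            grad_H += R @ U.T
            U -= lr * (H.T @ R + lam * U)  # in-place update of the view basis
        H -= lr * grad_H
    return H, Us
```

Because H is shared across views, entries missing in one view can still be reconstructed from the information the other views contribute to H.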
Since the squared ℓ2-norm is not robust to noise, and value missing is a typical kind of noise, squared ℓ2-norm based loss functions are not ideal for incomplete multi-view data. Therefore, to make the learned representation matrix less affected by the missing values, more robust loss functions are desirable. In this paper, we propose Incomplete Multi-view Learning via Half-quadratic Minimization (IMLHM) to solve this issue. Specifically, following the matrix factorization line, we attempt to learn a common representation matrix for incomplete multi-view data. Unlike previous methods based directly on the sum-of-squares error, a robust estimator based on half-quadratic theory is imposed on the loss of each sample of each view. With the help of the robust estimator, well recovered samples are assigned larger weights and vice versa. In this way, the robustness against missing views and missing variables is improved. Additionally, noting that incomplete multi-view learning can be seen as a special case of matrix completion, the common representation matrix is forced to be low-rank to fulfill the standard assumption in matrix completion [25], [26]. Finally, the objective function of IMLHM is optimized with an iterative alternating algorithm. Experimental results on several incomplete multi-view data sets verify the effectiveness of the proposed method.
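To illustrate how half-quadratic minimization yields an automatic weighting scheme, the toy example below robustly estimates a location parameter with the Welsch estimator phi(e) = 1 - exp(-e^2 / sigma^2). The choice of estimator and the one-dimensional setting are illustrative assumptions, not the exact IMLHM objective:

```python
import numpy as np

def welsch_weights(residuals, sigma):
    """Auxiliary weights from the multiplicative half-quadratic form of
    the Welsch estimator: w = exp(-e^2 / sigma^2). Large residuals
    (badly recovered samples) receive weights near zero."""
    return np.exp(-(residuals ** 2) / sigma ** 2)

def hq_weighted_mean(x, sigma=1.0, n_iter=20):
    """Half-quadratic estimation of a robust location parameter:
    alternate a closed-form weight update with the weighted
    least-squares problem those weights induce."""
    mu = x.mean()
    for _ in range(n_iter):
        w = welsch_weights(x - mu, sigma)
        mu = (w * x).sum() / w.sum()
    return mu
```

Each iteration alternates the closed-form weight update with the weighted least-squares solve; an outlier's weight collapses toward zero, so the estimate stays near the inliers, which is exactly the robustness mechanism described above.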

The remainder of the paper is organized as follows. Section 2 introduces the problem setting and some related incomplete multi-view learning methods. In Section 3, we formulate the incomplete-view problem and introduce our proposed algorithm. The optimization strategy and the convergence behavior of the optimization method are presented in Section 4. Section 5 provides the computational complexity. We analyze the experimental results in Section 6 and conclude in Section 7.


Problem setting and related work

In this section, we first introduce the problem setting, and then review some related incomplete multi-view learning methods.

Formulation

In this section, we first present the Incomplete Multi-view Learning via Half-quadratic Minimization (IMLHM) framework. Since our method utilizes half-quadratic minimization theory, we then introduce some details about half-quadratic minimization and analyze how it works in our method.

In practice, incomplete multi-view data commonly suffer from both view missing and variable missing. Therefore, we focus on the case where views and variables are absent simultaneously.
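A minimal way to simulate this setting is sketched below; the missing rates and the function name are our own assumptions, used only to make the two missing patterns concrete:

```python
import numpy as np

def make_incomplete(Xs, view_miss=0.2, var_miss=0.1, seed=0):
    """Simulate simultaneous view and variable missing.

    Some samples lack an entire view (view missing) and the remaining
    observed views lose individual entries (variable missing).
    Returns one boolean mask per view, True = observed."""
    rng = np.random.default_rng(seed)
    masks = []
    for X in Xs:
        m = rng.random(X.shape) > var_miss          # variable-level missing
        gone = rng.random(X.shape[0]) < view_miss   # this sample lacks the view
        m[gone, :] = False
        masks.append(m)
    return masks
```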

Optimization and Convergence Analysis

In this section, we introduce the alternating minimization strategy of the proposed method. Then we analyze the convergence of IMLHM. Finally, we provide the computational complexity of our method.

Experiment

In this section, we first introduce the experimental data sets and settings. Next, we verify the effectiveness of the proposed method. Then we present the reconstruction visualization and examine the impact of parameters. Finally, we report the results on convergence and time consumption.

Conclusion

In this paper, to deal with missing views and missing features in incomplete multi-view learning, we propose Incomplete Multi-view Learning via Half-quadratic Minimization. A robust estimator is incorporated into the loss function. By optimizing the model with half-quadratic minimization, small weights are allocated to abnormal samples in each view, which effectively improves the robustness of the model. The results on several data sets confirm the effectiveness of our method.

CRediT authorship contribution statement

Jiacheng Jiang: Software, Writing - original draft. Hong Tao: Methodology, Software, Investigation, Funding acquisition, Conceptualization. Ruidong Fan: Investigation. Wenzhang Zhuge: Investigation. Chenping Hou: Supervision, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported in part by the NSF of China under Grant 61922087, Grant 61906201 and Grant 62006238, and in part by the NSF for Distinguished Young Scholars of Hunan Province under Grant 2019JJ20020.

Jiacheng Jiang is a Master degree candidate at the National University of Defense Technology, Changsha, China. He received the B.S. degree from the same university in 2018. His research interests include data mining and machine learning.

References (43)

  • X. Xue et al., A multiview learning framework with a linear computational cost, IEEE Trans. Cybern. (2017)
  • C. Tang et al., Learning a joint affinity graph for multiview subspace clustering, IEEE Trans. Multimedia (2018)
  • Y. Li et al., A survey of multi-view representation learning, IEEE Trans. Knowl. Data Eng. (2018)
  • H. Tao et al., Joint embedding learning and low-rank approximation: A framework for incomplete multiview learning, IEEE Trans. Cybern. (2021)
  • H. Tao et al., Unsupervised maximum margin incomplete multi-view clustering, International CCF Conference on Artificial Intelligence, Springer (2018)
  • W. Zhuge et al., Simultaneous representation learning and clustering for incomplete multi-view data, IJCAI (2019)
  • C. Hou et al., Multi-view unsupervised feature selection with adaptive similarity and view weight, IEEE Trans. Knowl. Data Eng. (2017)
  • L. Zhang et al., Multi-view missing data completion, IEEE Trans. Knowl. Data Eng. (2018)
  • H. Tao, C. Hou, J. Zhu, D. Yi, Multi-view clustering with adaptively learned graph, in: Asian Conference on Machine...
  • S. Bhadra et al., Multi-view kernel completion, Mach. Learn. (2017)
  • A. Trivedi, P. Rai, H. Daumé III, S.L. DuVall, Multiview clustering with incomplete views, in: NIPS workshop, Vol. 224,...


Hong Tao received the Ph.D. degree from the National University of Defense Technology, Changsha, China, in 2019. She is currently a Lecturer with the College of Liberal Arts and Science of the same university. Her research interests include machine learning, system science, and data mining.

Ruidong Fan is a Master degree candidate at the National University of Defense Technology, Changsha, China. He received the B.S. degree from Lanzhou University in 2018. His research interests include data mining, optimization and machine learning.

Wenzhang Zhuge is a Ph.D. candidate at the National University of Defense Technology, Changsha, China. He received the B.S. degree from Shandong University, Jinan, China, in 2015 and the M.S. degree from the National University of Defense Technology, Changsha, China, in 2017. His research interests include machine learning, system science and data mining.

Chenping Hou received the Ph.D. degree from the National University of Defense Technology, Changsha, China, in 2009. He is currently a full Professor with the Department of Systems Science of the same university. He has authored 80+ peer-reviewed papers in journals and conferences, such as IEEE TPAMI, TNNLS/TNN, IEEE TSMCB/TCB, IEEE TIP, IJCAI and AAAI. His current research interests include machine learning, data mining, and computer vision.

1

These authors contributed equally to this article.
