
Neurocomputing

Volume 443, 5 July 2021, Pages 106-116

Incomplete multi-view learning via half-quadratic minimization

https://doi.org/10.1016/j.neucom.2021.02.043

Abstract

To deal with the incomplete multi-view data that arise in real applications, incomplete multi-view learning has developed rapidly in recent years. Among the various incomplete multi-view learning methods, a considerable number are built on the matrix factorization technique. Most existing matrix factorization based methods directly adopt the sum of squared ℓ2-norms as the loss function, which is known to be sensitive to missing values. To overcome this issue, we propose a new matrix factorization method, named Incomplete Multi-view Learning via Half-quadratic Minimization (IMLHM). Different from previous methods, a robust estimator based on half-quadratic minimization theory is incorporated into our loss function to overcome the sensitivity of the ℓ2-norm to noise. The influence of badly recovered instances is decreased via the automatic weighting scheme derived from the half-quadratic minimization process, thereby improving the robustness of the proposed method. Additionally, a nuclear norm term is introduced to exploit the low-rank structure of the learned representation matrix, further improving the robustness of the proposed method against noise. An alternating iterative algorithm is developed to optimize the objective function. Comprehensive experimental results on seven data sets verify the effectiveness of the proposed method.
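The nuclear norm term described above is typically handled through its proximal operator, known as singular value thresholding. The sketch below is a standard building block, not the authors' code; the function name and interface are ours:

```python
import numpy as np

def svt(Z, tau):
    """Singular value thresholding: the proximal operator of tau * ||.||_*,
    i.e. argmin_P  0.5 * ||P - Z||_F^2 + tau * ||P||_*.
    Shrinking singular values toward zero encourages a low-rank P."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt
```

Singular values below tau are zeroed, so the result has reduced rank; this is the update a low-rank representation subproblem would use inside an alternating scheme.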

Introduction

In recent years, multi-view data have been produced in various applications. For example, images can be described by Histogram of Oriented Gradients (HOG) [1], Local Binary Patterns (LBP) [2] and SIFT [3], and a film segment can be represented by voice and video features. Research on multi-view learning has attracted numerous scholars' attention [4], [5], [6], [7], [8], [9], [43]. Most traditional multi-view studies assume that the data on each view are complete. However, in real applications, the integrity of multi-view data usually cannot be guaranteed [10], [11], [12], [44]. Practically, human error and mechanical failure occur from time to time in the data collection process, which leads to incomplete multi-view data. Many traditional multi-view methods simply discard incomplete samples, which undoubtedly reduces the number of available samples [13], [14], [15], [42]. To make use of the information contained in the incomplete samples, a straightforward way is to fill the missing values with 0 or with the mean of the surviving values. Nevertheless, the performance of this filling strategy is still limited. To make better use of incomplete multi-view samples, various incomplete multi-view learning methods have been proposed. They mainly fall into three categories: (1) Matrix factorization based methods with different constraints or regularization terms. These methods attempt to learn a representation matrix containing information as complete as possible, and can easily be extended to handle the case of both view missing and variable missing. (2) Kernel-based methods, which import kernel tricks into incomplete multi-view learning [16]. For instance, the within-view and between-view relationships are combined to predict the missing rows and columns of kernel matrices [17]. (3) Graph-based methods [18], [19]. When constructing the graph on each view, the connecting weight involving missing samples is set to 0. Then, based on the graphs of the multiple views, a common representation matrix is learned. For example, Wen et al. [19] combined the self-representation model and spectral clustering to learn a common relaxed clustering indicator matrix after graph construction. Kernel-based and graph-based methods achieve promising performance, but most of them are not applicable when both views and variables are missing. To handle the various missing patterns found in real applications, we focus on matrix factorization based methods.
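The zero- and mean-filling baseline mentioned above can be sketched in a few lines; the helper below is illustrative, not taken from any cited method:

```python
import numpy as np

def fill_missing(X, mask, strategy="mean"):
    """Fill missing entries of one view's data matrix.

    X    : (n_samples, n_features) array; missing entries hold arbitrary values.
    mask : boolean array of the same shape; True marks an observed entry.
    """
    X_filled = np.where(mask, X, 0.0)  # zero-filling baseline
    if strategy == "mean":
        # column mean computed over the observed (surviving) entries only
        counts = mask.sum(axis=0)
        col_mean = np.divide(X_filled.sum(axis=0), counts,
                             out=np.zeros(X.shape[1]), where=counts > 0)
        X_filled = np.where(mask, X, col_mean)
    return X_filled
```

Mean filling uses only the observed entries of each feature, which is one reason its performance degrades as the missing rate grows.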

Based on the assumption that different views are generated from a common subspace, a number of popular incomplete multi-view learning methods have been designed using the matrix factorization technique. Concretely, based on Nonnegative Matrix Factorization (NMF) and ℓ1 sparse regularization, Li et al. [20] proposed the Partial multi-View Clustering (PVC) method. PVC is a pioneering work that learns a full representation for incomplete two-view data. Then, Xu et al. [21] proposed the Multi-View Learning with Incomplete Views (MVL-IV) method, which factorizes the incomplete data matrix of each view as the product of a view-specific basis matrix and a common representation matrix. Almost simultaneously, Shao et al. [22] put forward the Multiple Incomplete views Clustering (MIC) method, based on NMF and ℓ2,1 regularization. Concretely, MIC fills the missing views with the mean of the available instances and gives smaller weights to the incomplete instances. Since geometric information was not fully exploited by previous algorithms, Zhao et al. proposed the Incomplete Multi-modality Grouping (IMG) [23] approach to explore the compact global structure of the data when learning the latent subspace. Specifically, an automatically learned graph Laplacian term is imposed on the latent representation to force similar samples to be close in the latent subspace. Utilizing weighted semi-NMF, Hu et al. developed the Doubly Aligned Incomplete Multi-view Clustering (DAIMC) algorithm to align samples and basis matrices simultaneously [24]. Although the above mentioned methods have greatly promoted the development of incomplete multi-view learning, their performance can be further improved. It can be observed that all of them employ the squared ℓ2-norm as the loss function.
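As a rough illustration of the shared-representation idea behind methods such as MVL-IV, the sketch below fits a common representation H and view-specific basis matrices U_v to the observed entries only, using masked gradient descent. The names and the optimization scheme are our own simplification, not any of the cited implementations:

```python
import numpy as np

def mvl_factorize(Xs, masks, k, n_iter=500, lr=0.02, lam=1e-3, seed=0):
    """Fit X_v ~= H @ U_v over observed entries only.

    Xs    : list of (n, d_v) view matrices (rows are samples)
    masks : list of boolean (n, d_v) arrays, True = observed entry
    k     : dimension of the common representation H (n, k)
    """
    rng = np.random.default_rng(seed)
    n = Xs[0].shape[0]
    H = 0.1 * rng.standard_normal((n, k))
    Us = [0.1 * rng.standard_normal((k, X.shape[1])) for X in Xs]
    for _ in range(n_iter):
        grad_H = lam * H
        for X, M, U in zip(Xs, masks, Us):
            R = M * (H @ U - X)            # residual, masked to observed entries
            grad_H += R @ U.T
            U -= lr * (H.T @ R + lam * U)  # in-place update of the view basis
        H -= lr * grad_H
    return H, Us
```

Because H is shared across views, entries missing in one view can still be reconstructed from the information the other views contribute to H.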
Since the squared ℓ2-norm is not robust to noise, and value missing is a typical kind of noise, squared ℓ2-norm based loss functions are not ideal for incomplete multi-view data. Therefore, to make the learned representation matrix less affected by the missing values, more robust loss functions are desirable. In this paper, we propose Incomplete Multi-view Learning via Half-quadratic Minimization (IMLHM) to solve this issue. Specifically, following the matrix factorization line, we attempt to learn a common representation matrix for incomplete multi-view data. Unlike previous methods based directly on the sum-of-squares error, a robust estimator based on half-quadratic theory is imposed on the loss of each sample of each view. With the help of the robust estimator, well recovered samples are assigned larger weights and vice versa. In this way, the robustness against missing views and missing variables is improved. Additionally, noting that incomplete multi-view learning can be seen as a special case of matrix completion, the common representation matrix is forced to be low-rank to fulfill the standard assumption in matrix completion [25], [26]. Finally, the objective function of IMLHM is optimized with an iterative alternating algorithm. Experimental results on several incomplete multi-view data sets verify the effectiveness of the proposed method.
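To illustrate how half-quadratic minimization yields an automatic weighting scheme, the toy example below robustly estimates a location parameter with the Welsch estimator phi(e) = 1 - exp(-e^2 / sigma^2). The choice of estimator and the one-dimensional setting are illustrative assumptions, not the exact IMLHM objective:

```python
import numpy as np

def welsch_weights(residuals, sigma):
    """Auxiliary weights from the multiplicative half-quadratic form of
    the Welsch estimator: w = exp(-e^2 / sigma^2). Large residuals
    (badly recovered samples) receive weights near zero."""
    return np.exp(-(residuals ** 2) / sigma ** 2)

def hq_weighted_mean(x, sigma=1.0, n_iter=20):
    """Half-quadratic estimation of a robust location parameter:
    alternate a closed-form weight update with the weighted
    least-squares problem those weights induce."""
    mu = x.mean()
    for _ in range(n_iter):
        w = welsch_weights(x - mu, sigma)
        mu = (w * x).sum() / w.sum()
    return mu
```

Each iteration alternates the closed-form weight update with the weighted least-squares solve; an outlier's weight collapses toward zero, so the estimate stays near the inliers, which is exactly the robustness mechanism described above.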

The remainder of the paper is organized as follows. Section 2 introduces the problem setting and some related incomplete multi-view learning methods. In Section 3, we formulate the incomplete-view problem and introduce our proposed algorithm. The optimization strategy and the convergence behavior of the optimization method are presented in Section 4. Section 5 provides the computational complexity. We analyze the experimental results in Section 6 and conclude in Section 7.


Problem setting and related work

In this section, we first introduce the problem setting, and then review some related incomplete multi-view learning methods.

Formulation

In this section, we first present the Incomplete Multi-view Learning via Half-quadratic Minimization (IMLHM) framework. Since our method utilizes half-quadratic minimization theory, we then introduce some details about half-quadratic minimization and analyze how it works in our method.

In practice, incomplete multi-view data commonly suffer from both view missing and variable missing. Therefore, we focus on the case where views and variables are absent simultaneously.
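A minimal way to simulate this setting is sketched below; the missing rates and the function name are our own assumptions, used only to make the two missing patterns concrete:

```python
import numpy as np

def make_incomplete(Xs, view_miss=0.2, var_miss=0.1, seed=0):
    """Simulate simultaneous view and variable missing.

    Some samples lack an entire view (view missing) and the remaining
    observed views lose individual entries (variable missing).
    Returns one boolean mask per view, True = observed."""
    rng = np.random.default_rng(seed)
    masks = []
    for X in Xs:
        m = rng.random(X.shape) > var_miss          # variable-level missing
        gone = rng.random(X.shape[0]) < view_miss   # this sample lacks the view
        m[gone, :] = False
        masks.append(m)
    return masks
```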

Optimization and Convergence Analysis

In this section, we introduce the alternating minimization strategy of the proposed method. Then we analyze the convergence of IMLHM. Finally, we provide the computational complexity of our method.

Experiment

In this section, we first introduce the experimental data sets and settings. Next, we verify the effectiveness of the proposed method. Then we present the reconstruction visualization and examine the impact of parameters. Finally, we report the results on convergence and time consumption.

Conclusion

In this paper, to deal with missing views and missing features in incomplete multi-view learning, we propose Incomplete Multi-view Learning via Half-quadratic Minimization. A robust estimator is incorporated into the loss function. By optimizing the model with half-quadratic minimization, small weights are allocated to abnormal samples in each view, which effectively improves the robustness of the model. The results on several data sets confirm the effectiveness of our method.

CRediT authorship contribution statement

Jiacheng Jiang: Software, Writing - original draft. Hong Tao: Methodology, Software, Investigation, Funding acquisition, Conceptualization. Ruidong Fan: Investigation. Wenzhang Zhuge: Investigation. Chenping Hou: Supervision, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported in part by the NSF of China under Grant 61922087, Grant 61906201 and Grant 62006238, and in part by the NSF for Distinguished Young Scholars of Hunan Province under Grant 2019JJ20020.

Jiacheng Jiang is a Master degree candidate at the National University of Defense Technology, Changsha, China. He received the B.S. degree from the same university in 2018. His research interests include data mining and machine learning.

References (43)

  • X. Xue et al., A multiview learning framework with a linear computational cost, IEEE Trans. Cybern. (2017)
  • C. Tang et al., Learning a joint affinity graph for multiview subspace clustering, IEEE Trans. Multimedia (2018)
  • Y. Li et al., A survey of multi-view representation learning, IEEE Trans. Knowl. Data Eng. (2018)
  • H. Tao et al., Joint embedding learning and low-rank approximation: A framework for incomplete multiview learning, IEEE Trans. Cybern. (2021)
  • H. Tao et al., Unsupervised maximum margin incomplete multi-view clustering, International CCF Conference on Artificial Intelligence, Springer (2018)
  • W. Zhuge et al., Simultaneous representation learning and clustering for incomplete multi-view data, IJCAI (2019)
  • C. Hou et al., Multi-view unsupervised feature selection with adaptive similarity and view weight, IEEE Trans. Knowl. Data Eng. (2017)
  • L. Zhang et al., Multi-view missing data completion, IEEE Trans. Knowl. Data Eng. (2018)
  • H. Tao, C. Hou, J. Zhu, D. Yi, Multi-view clustering with adaptively learned graph, in: Asian Conference on Machine...
  • S. Bhadra et al., Multi-view kernel completion, Mach. Learn. (2017)
  • A. Trivedi, P. Rai, H. Daumé III, S.L. DuVall, Multiview clustering with incomplete views, in: NIPS workshop, Vol. 224,...


Hong Tao received the Ph.D. degree from the National University of Defense Technology, Changsha, China, in 2019. She is currently a Lecturer with the College of Liberal Arts and Science of the same university. Her research interests include machine learning, system science, and data mining.

Ruidong Fan is a Master degree candidate at the National University of Defense Technology, Changsha, China. He received the B.S. degree from Lanzhou University in 2018. His research interests include data mining, optimization and machine learning.

Wenzhang Zhuge is a Ph.D. candidate at the National University of Defense Technology, Changsha, China. He received the B.S. degree from Shandong University, Jinan, China, in 2015 and the M.S. degree from the National University of Defense Technology, Changsha, China, in 2017. His research interests include machine learning, system science and data mining.

Chenping Hou received the Ph.D. degree from the National University of Defense Technology, Changsha, China, in 2009. He is currently a full Professor with the Department of Systems Science of the same university. He has authored 80+ peer-reviewed papers in journals and conferences, such as IEEE TPAMI, TNNLS/TNN, IEEE TSMCB/TCB, IEEE TIP, IJCAI and AAAI. His current research interests include machine learning, data mining, and computer vision.

1

These authors contributed equally to this article.
