A novel multi-view clustering method via low-rank and matrix-induced regularization

doi:10.1016/j.neucom.2016.08.014

Neurocomputing

Volume 216, 5 December 2016, Pages 342-350

https://doi.org/10.1016/j.neucom.2016.08.014

Abstract

Multi-view clustering algorithms have shown promising performance in various applications over the last few decades. Most of them, however, do not adequately account for noise or for the correlation among multiple views, which may degrade clustering performance. In this paper, we propose a novel multi-view clustering method to address these issues. Specifically, we decompose the similarity matrix of each view into a low-rank consensus matrix shared across views and a view-specific sparse error matrix. Furthermore, a matrix-induced regularization term is incorporated to reduce the redundancy and enhance the diversity among different views. The augmented Lagrangian multiplier algorithm is adopted to solve the resultant optimization problem. Comprehensive experiments are conducted to verify the effectiveness of the proposed algorithm. Results demonstrate that our algorithm outperforms several state-of-the-art ones on both synthetic and benchmark data sets.

Introduction

Multi-view clustering, which aims to partition data points into groups of similar points by exploiting information from different views, has shown strong performance in recent years. Multi-view data are common in real-life problems. For example, web data can be represented by features extracted from text and from hyperlinks, whereas books and papers can be translated into different languages. Several multi-view clustering algorithms have been proposed. However, identifying the underlying relationships between data points is difficult because of noise and redundant information.

Robust clustering methods have been proposed to handle noise, and low-rank modeling has proven effective for robustness. Ye et al. [25] propose a robust late fusion method that decomposes each original score matrix from individual models into a common rank-2 matrix and sparse deviation errors. Pan et al. [22] utilize a low-rank constraint for rank aggregation. Similarly, Xia et al. [24] propose a robust multi-view spectral clustering (RMSC) method, which extracts a consensus transition probability matrix with a low-rank constraint and imposes a sparse constraint on each residual error matrix. Hong et al. [13] integrate different Low Rank Representation (LRR) affinity matrices to form a hypergraph Laplacian matrix. However, existing robust clustering methods treat the information from each view indiscriminately, and the redundant information between views degrades their clustering performance. Neglecting the correlation between different views causes two problems: (i) Information from similar views is redundant and dominates the clustering. (ii) Information from a distinctive view is suppressed because it is underrepresented.

To address the correlation between different views, some researchers [4], [16] have shown that the independence of different views helps in multi-view learning. Previous studies suggest that high independence between two variables translates to high diversity [20], [21]. All types of dependence measures share the same intrinsic factor, which expresses the similarity between views. Cao et al. [2] extend existing subspace clustering into the multi-view domain and incorporate the Hilbert Schmidt Independence Criterion (HSIC) as a diversity term to explore the complementarity of multi-view representations. Liu et al. [19] propose a multiple kernel k-means clustering method with a matrix-induced regularization that reduces redundancy and enhances diversity. Kumar and Daumé [15] apply a co-training approach for multi-view spectral clustering to explore intra-cluster information. Zhou et al. [32] utilize Kullback–Leibler (KL) divergence to guide clustering ensemble and achieve outstanding performance. However, these multi-view methods have limited robustness to noise.

Considering that diversity improves clustering performance, we construct a matrix-induced regularization term for our method. To retain the strength of robust methods, we also impose a low-rank constraint to deal with noise. We therefore propose a novel multi-view clustering method via low-rank and matrix-induced regularization (MCM-LRMIR). The main contribution is that we introduce a matrix-induced regularization term into the low-rank framework, which not only retains the robustness of the low-rank framework but also ensures diversity between different views. First, we apply Gaussian kernels to define the similarity matrix of each view. Then, following Xia et al. [24], who explicitly handle noise in the transition probability matrices from different views, we apply similar low-rank and sparse decomposition constraints and introduce a matrix-induced regularization to guide the objective function. Finally, we propose an alternating optimization procedure based on the augmented Lagrangian multiplier (ALM) scheme [17] to solve the objective function. Experimental results show that this procedure converges within several iterations. In addition, experimental results on both synthetic and benchmark datasets of multi-view clustering demonstrate that the proposed method outperforms several state-of-the-art ones in the literature.

Section snippets

Related work

Multi-view clustering has been a hot topic in recent years. Related algorithms can be roughly grouped into two main categories: (i) Construct a common feature representation for all views before or during clustering. (ii) Integrate the clustering results from each view to obtain a final one.

Existing methods belonging to the first category have diverse ways of obtaining a common feature representation. With a priori hypothesis that the optimal kernel can be obtained by the linear combination of

Single view clustering

In single-view clustering, MKSC is similar to spectral clustering via Markov chains [30]. Let $X = \{x_{1}, x_{2}, \dots, x_{n}\} \in \mathbb{R}^{d \times n}$ be a set of $n$ data points. Define a similarity matrix $S \in \mathbb{R}^{n \times n}$ to denote the similarity between samples. In general, Gaussian kernels are used to define the similarity matrix: $S_{ij} = \exp\left(-\frac{\| x_{i} - x_{j} \|_{2}^{2}}{\delta^{2}}\right)$, where $\| \cdot \|_{2}$ denotes the $\ell_{2}$ norm and $\delta^{2}$ denotes the average Euclidean distance over all pairs of data points. Let $G = (V, E, S)$ be a weighted graph with vertex set $V$, edge set $E$, and the similarity
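As a minimal sketch of the Gaussian-kernel similarity described above, the following NumPy snippet builds $S$ for data stored column-wise. The function name `gaussian_similarity` is hypothetical, and taking the average over distinct pairs ($i < j$) is an assumption about how "all pairs" is counted; it is an illustration, not the authors' implementation.

```python
import numpy as np

def gaussian_similarity(X):
    """Gaussian-kernel similarity for data stored column-wise (X has shape d x n)."""
    # Squared Euclidean distances between all pairs of columns.
    sq_norms = np.sum(X ** 2, axis=0)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X.T @ X)
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negatives from round-off
    np.fill_diagonal(sq_dists, 0.0)       # a point is at distance zero from itself

    # Bandwidth as described in the text: delta^2 is the average Euclidean
    # distance over all pairs (assumption: distinct pairs i < j only).
    n = X.shape[1]
    upper = np.triu_indices(n, k=1)
    delta_sq = np.mean(np.sqrt(sq_dists[upper]))

    return np.exp(-sq_dists / delta_sq)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))  # d = 5 features, n = 8 points
S = gaussian_similarity(X)
```

The result is a symmetric $n \times n$ matrix with ones on the diagonal and entries in $(0, 1]$; each view of a multi-view data set would get its own such matrix.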

The proposed algorithm

Existing algorithms have shown promising clustering performance. However, most of them have not adequately considered the diversity between information from different views [5], [7], [14], [24], [26]. For example, Eq. (3) in RMSC combines all views indiscriminately. The optimization procedure would render the error matrices from similar views small and sparse because similar views lead to similar error matrices and outnumber the others. In this respect, similar views are dominant and over
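To make the concern about near-duplicate views concrete, here is a small illustrative sketch of a quadratic view-weight penalty of the form $\omega^{\top} H \omega$. The choice $H_{ij} = \langle X^{(i)}, X^{(j)} \rangle_{F}$ is an assumption made only for this example (the definition of $H$ is not given in this excerpt), and the helper names are hypothetical.

```python
import numpy as np

def view_correlation_matrix(views):
    """H[i, j] = Frobenius inner product of views i and j (an assumed choice of H)."""
    flat = np.stack([V.ravel() for V in views])
    return flat @ flat.T

def diversity_penalty(omega, H):
    """Quadratic term omega^T H omega; large when weight sits on correlated views."""
    return float(omega @ H @ omega)

# Toy views with unit Frobenius norm: A and A_copy are identical,
# B is orthogonal to both in the Frobenius inner product.
A = np.eye(3) / np.sqrt(3.0)
A_copy = A.copy()
B = np.roll(np.eye(3), 1, axis=1) / np.sqrt(3.0)

H = view_correlation_matrix([A, A_copy, B])

uniform = np.array([1 / 3, 1 / 3, 1 / 3])   # counts the duplicated view twice
balanced = np.array([1 / 4, 1 / 4, 1 / 2])  # splits one share across the duplicates

p_uniform = diversity_penalty(uniform, H)    # 5/9, about 0.556
p_balanced = diversity_penalty(balanced, H)  # 0.5
```

Under this assumed $H$, uniform weights are penalized more than weights that split a single share across the two identical views, which is the kind of redundancy reduction a matrix-induced regularizer is meant to encourage.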

Optimization

The optimization in Eq. (6) is still difficult because of the trace norm and the ℓ₁ norm. In this section, we apply the ALM scheme [17] to solve it.
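In ALM-type solvers, the two non-smooth terms named above are commonly handled through their proximal operators: entrywise soft-thresholding for the $\ell_{1}$ norm and singular value thresholding for the trace norm. The sketch below shows these two generic building blocks in NumPy; the function names are illustrative, and the snippet is not claimed to reproduce the authors' exact update steps.

```python
import numpy as np

def soft_threshold(A, tau):
    """Proximal operator of tau * ||.||_1: shrink every entry toward zero by tau."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def singular_value_threshold(A, tau):
    """Proximal operator of tau * ||.||_* (trace norm): soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Scale each left singular vector by its shrunken singular value.
    return (U * np.maximum(s - tau, 0.0)) @ Vt

A = np.array([[3.0, 0.0], [0.0, 0.5]])
E = soft_threshold(A, 1.0)            # entries below 1 in magnitude become 0
J = singular_value_threshold(A, 1.0)  # singular values (3, 0.5) become (2, 0)
```

In this toy case both operators return `[[2, 0], [0, 0]]`: soft-thresholding promotes sparsity entrywise, while singular value thresholding lowers the rank.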

First, we replace $\hat{X}$ with $J$ in the trace-norm term and add their equality as a constraint:

$\min_{J, \hat{X}, E^{(i)}, \omega} \| J \|_{*} + \lambda \sum_{i = 1}^{m} \omega_{i} \| E^{(i)} \|_{1} + \beta \omega^{\top} H \omega \quad \text{s.t.} \quad X^{(i)} = \hat{X} + E^{(i)}, \; \hat{X} \geq 0, \; \hat{X} \mathbf{1} = \mathbf{1}, \; \omega^{\top} \mathbf{1} = 1, \; J = \hat{X}.$

Then, we obtain the corresponding augmented Lagrange function:

$L (J, \hat{X}, E^{(i)}, \omega) = \| J \|_{*} + \lambda \sum_{i = 1}^{m} \omega_{i} \| E^{(i)} \|_{1} + \sum_{i = 1}^{m} \langle Y^{(i)}, \hat{X} + E^{(i)} - X^{(i)} \rangle + \frac{\mu}{2} \sum_{i = 1}^{m} \| \hat{X} + E^{(i)} - X^{(i)} \|_{F}^{2} + \langle Z, \hat{X} - J \rangle + \frac{\mu}{2} \| \hat{X} - J \|_{F}^{2} + \beta \omega^{\top} H \omega \quad \text{s.t.} \quad \hat{X} \geq 0, \; \hat{X} \mathbf{1} = \mathbf{1}, \; \omega^{\top}$

Experiments

In this section, we evaluate our algorithm on several synthetic and real-world benchmark data sets. Our proposed algorithm shows better performance than the compared methods. The number of clusters is set to the true number of classes for all the data sets.

Conclusion

We propose a novel method (MCM-LRMIR) that not only retains the benefit of consensus matrix recovery methods but also considers the diversity of multiple views to guide clustering. Our algorithm is easily implemented with an alternating optimization in which most subproblems can be solved using off-the-shelf methods. Experiments on both synthetic and benchmark data sets show that our algorithm outperforms state-of-the-art ones. Parameter analysis shows that our proposed method is insensitive to

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Project no. 61403405).

Yang Zhao received the B.S. degree in computer science from the National University of Defense Technology, Changsha, China, in 2014. He is currently working toward the M.S. degree at the National University of Defense Technology. His research interests include machine learning and computer vision.

References (33)

Chaoqun Hong et al., Multi-view hypergraph learning by patch alignment framework, Neurocomputing (2013)
K. Zeng et al., Image clustering by hyper-graph regularized non-negative matrix factorization, Neurocomputing (2014)
Sihang Zhou et al., Random Fourier extreme learning machine with $\ell_{2,1}$-norm regularization, Neurocomputing (2016)
Jian-Feng Cai et al., A singular value thresholding algorithm for matrix completion, SIAM J. Optim. (2010)
Xiaochun Cao, Changqing Zhang, Huazhu Fu, Si Liu, Hua Zhang, Diversity-induced multi-view subspace clustering, in:...
Kamalika Chaudhuri, Sham M. Kakade, Karen Livescu, Karthik Sridharan, Multi-view clustering via canonical correlation...
Liang Du, Peng Zhou, Lei Shi, Hanmo Wang, Mingyu Fan, Wenjian Wang, Yi-Dong Shen, Robust multiple kernel k-means...
John Duchi, Shai Shalev-Shwartz, Yoram Singer, Tushar Chandra, Efficient projections onto the $\ell_{1}$-ball for learning in...
Mehmet Gönen, Adam A. Margolin, Localized data fusion for kernel k-means clustering with application to cancer biology,...
Derek Greene, Pádraig Cunningham, A matrix factorization approach for integrating multiple data views, in: Machine...
Arthur Gretton, Olivier Bousquet, Alex Smola, Bernhard Schölkopf, Measuring statistical dependence with Hilbert–Schmidt...
Arthur Gretton, Karsten M. Borgwardt, Malte J. Rasch, Bernhard Schölkopf, Alexander J. Smola, A kernel method for the...
Chaoqun Hong et al., Image-based three-dimensional human pose recovery by multiview locality-sensitive sparse retrieval, IEEE Trans. Ind. Electron. (2015)
Chaoqun Hong et al., Multimodal deep autoencoder for human pose recovery, IEEE Trans. Image Process. (2015)
Hsin-Chien Huang et al., Multiple kernel fuzzy clustering, IEEE Trans. Fuzzy Syst. (2012)

Yong Dou received his B.S., M.S., and Ph.D. degrees from National University of Defense Technology in 1989, 1992 and 1995. His research interests include high performance computer architecture, high performance embedded microprocessor, reconfigurable computing, machine learning, and bioinformatics. He is a member of the IEEE and the ACM.

Xinwang Liu received the M.S. and Ph.D. degrees from the National University of Defense Technology, China, in 2008 and 2013, respectively. Since January 2014, he has worked as a research assistant at the National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha, China. His research interests focus on designing algorithms for kernel learning, feature selection and multi-view clustering.

Teng Li received the B.S. degree and the M.S. degree in computer science from the National University of Defense Technology, Changsha, China, in 2013 and 2015, respectively. He is currently working toward the Ph.D. degree at the National University of Defense Technology. His research interests include machine learning and computer vision.
