Neurocomputing

Volume 475, 28 February 2022, Pages 38-52

Confidence level auto-weighting robust multi-view subspace clustering

https://doi.org/10.1016/j.neucom.2021.12.029

Abstract

For the subspace clustering task on multi-view data, mining the complementary features across all views is an important research problem. However, when learning the consensus representation of all views, each view may have a different confidence level. In addition, owing to the non-linearity and noise pollution of the data, different samples within the same view may also have different confidence levels. Unfortunately, most existing methods assign only a uniform weight to each view and may therefore obtain a suboptimal solution. In this work, we put forward a confidence level auto-weighting robust multi-view subspace clustering (CLWRMSC) model. Specifically, we design an adaptive sample weighting strategy that enables our model to account for the confidence of both views and samples while learning the consensus representation of all views. At the same time, using the self-expression property of subspaces, an adaptive low-rank multi-kernel learning (MKL) strategy is designed. Further, the weighted truncated Schatten p-norm (WTSN) is proposed to better solve the optimization problem under low-rank constraints. Extensive experiments show that the proposed model is a superior clustering algorithm.

Introduction

In reality, the rapid growth of data scale makes it very difficult to directly process and understand raw data. Fortunately, by exploiting the structural characteristics of data to mine a compact representation, we can understand the raw data at minimal cost. As is well known, high-dimensional data can usually be modeled as samples drawn from a union of low-dimensional linear subspaces, so subspace clustering has been widely used [1], [2], [3]. Recently, scholars have proposed various subspace clustering algorithms, including statistical, iterative, algebraic, and spectral methods. Among them, the spectral methods have received extensive attention because of their effectiveness for data clustering; typical examples are Sparse Subspace Clustering (SSC) [4], Low-Rank Representation (LRR) [5] and Block Diagonal Representation (BDR) [6]. They have been applied to image classification, motion segmentation, document clustering and so on. In general, the key to improving clustering performance is how to construct a robust affinity matrix (or graph) for the standard spectral algorithm. This is also the goal of the proposed algorithm. Before proceeding, we introduce our notation conventions in Table 1.
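Since all of the spectral-based methods above reduce to running a standard spectral algorithm on a learned affinity matrix, the final step can be sketched as follows (a minimal Python example using scikit-learn; the affinity here is a random stand-in, not one produced by any of the cited methods):

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Stand-in affinity matrix W (symmetric, non-negative); in practice W is
# built from a learned coefficient matrix Z, e.g. W = (|Z| + |Z^T|) / 2.
rng = np.random.default_rng(0)
W = np.abs(rng.standard_normal((200, 200)))
W = 0.5 * (W + W.T)

# Standard spectral clustering on the precomputed affinity.
labels = SpectralClustering(n_clusters=5, affinity="precomputed",
                            random_state=0).fit_predict(W)
```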

For linear subspaces, data points in the same subspace can be linearly represented by each other, i.e., the subspace self-expression property, which can be mathematically expressed as
$$\min_{Z}\ \frac{1}{2}\|X-XZ\|_F^2+\lambda R(Z)\quad \mathrm{s.t.}\ Z_{ii}=0,$$
where $\lambda>0$ is a trade-off parameter and $R(Z)$ is a penalty (or regularization) term. There are currently many options for $R(Z)$, such as $\|Z\|_1$ [7], [8], $\|Z\|_{*}$ [9], [10], $\|Z\|_F^2$ [11], $\|Z\|_{k}$ [6] and so on.
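As a concrete instance, with the Frobenius penalty $R(Z)=\|Z\|_F^2$ [11] the problem has a closed-form solution. The sketch below (our own minimal numpy illustration, not the paper's method) solves it and zeroes the diagonal afterwards as a simple heuristic for the $Z_{ii}=0$ constraint:

```python
import numpy as np

def self_expression_frobenius(X, lam=0.1):
    """Solve min_Z 0.5*||X - X Z||_F^2 + lam*||Z||_F^2.

    Setting the gradient to zero gives Z = (X^T X + 2*lam*I)^{-1} X^T X.
    The hard constraint Z_ii = 0 is approximated here by zeroing the
    diagonal after solving (a simplification for illustration).
    """
    n = X.shape[1]                       # columns are samples
    G = X.T @ X                          # Gram matrix of the samples
    Z = np.linalg.solve(G + 2.0 * lam * np.eye(n), G)
    np.fill_diagonal(Z, 0.0)
    return Z

X = np.random.randn(50, 120)             # 50-dim features, 120 samples
Z = self_expression_frobenius(X)
W = 0.5 * (np.abs(Z) + np.abs(Z.T))      # affinity for spectral clustering
```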

To better treat the non-linearity and improve the accuracy of data reconstruction, some kernel-based SSC (KSSC) methods are designed [12], which perform clustering tasks in the linear feature space obtained by mapping the original data, i.e.,
$$\min_{Z}\ \frac{1}{2}\|\phi(X)-\phi(X)Z\|_F^2+\lambda R(Z)\quad \mathrm{s.t.}\ Z_{ii}=0.$$

This problem is equivalent to
$$\min_{Z}\ \frac{1}{2}\operatorname{tr}\big[(I-2Z+ZZ^{\top})K\big]+\lambda R(Z)\quad \mathrm{s.t.}\ Z_{ii}=0,$$
where $K=\phi(X)^{\top}\phi(X)$ is the kernel Gram matrix ($\phi$ itself need not be known explicitly).
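The equivalence follows by expanding the Frobenius norm and using the cyclic property of the trace: $\|\phi(X)(I-Z)\|_F^2=\operatorname{tr}[(I-Z)^{\top}K(I-Z)]=\operatorname{tr}[(I-2Z+ZZ^{\top})K]$ for symmetric $K$. A quick numerical check with the identity feature map $\phi(X)=X$ (our illustration, so $K=X^{\top}X$):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((30, 80))        # features x samples
Z = 0.05 * rng.standard_normal((80, 80))
K = X.T @ X                              # Gram matrix for phi(X) = X

lhs = np.linalg.norm(X - X @ Z, "fro") ** 2
rhs = np.trace((np.eye(80) - 2 * Z + Z @ Z.T) @ K)
assert np.isclose(lhs, rhs)              # the two objectives coincide
```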

In order to ensure that the feature space obtained by the kernel trick is composed of multiple low-dimensional subspaces, LrKSC [13] adds a low-rank constraint on the kernel mapping $\phi(X)$ within the SSC framework. Zhang et al. [14], [15] use more advanced low-rank constraints (i.e., $\|\cdot\|_{S_p}^{p}$ and $\|\cdot\|_{w,S_p}^{p}$) to optimize LrKSC, and then design two superior subspace clustering algorithms. Unfortunately, these methods are based on single kernel learning, so their dependence on kernel selection limits their clustering performance.

As a result, some scholars have optimized their algorithms and proposed multiple kernel learning (MKL) methods [16], [17], [18]. In recent years, as the superiority of MKL has been continuously demonstrated, its application to subspace clustering tasks has become more and more extensive [19], [20], [21]. The advantages of MKL are usually achieved by assigning an appropriate weight to each basic kernel. In [22], a spectral clustering model (AASC) is proposed that adopts multiple affinity matrices, a successful early attempt at multi-kernel clustering. Combining MKL with the k-means algorithm, [23] designs a robust multi-kernel k-means (RMKKM) method, which can independently select the appropriate kernel from a basic kernel pool and improve the efficiency of kernel design or selection. Inspired by this, an advanced method (SMKL) is designed and used for clustering tasks [24]. Similar to RMKKM, Kang et al. design an MKL clustering method (SCMK) by learning an optimal kernel [25]. Aiming at model robustness, Yang et al. [26] design a correntropy-metric weighting method for MKL and achieve ideal clustering results. Paying full attention to the inner neighborhood structure between base kernels, a neighbor-kernel-based MKC algorithm is proposed [27]. In short, a large number of research results prove that the learning effect and flexibility of MKL are better than those of single-kernel learning. Unfortunately, these multi-kernel clustering methods do not pay special attention to whether the data space obtained by the kernel trick contains multiple low-dimensional subspaces.
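To make the weighted-combination idea concrete, the sketch below forms a consensus kernel as a convex combination of base kernels and re-weights each base kernel by its normalized similarity to the consensus. This is only a generic illustration of the "closer to consensus, larger weight" principle; the weighting rules of the cited methods (and of our model later) are derived from their respective objectives:

```python
import numpy as np

def consensus_kernel(kernels, weights):
    """Consensus kernel as a convex combination of the base kernels."""
    return sum(w * K for w, K in zip(weights, kernels))

def similarity_weights(kernels, K_star):
    """Weight each base kernel by its normalized Frobenius inner
    product with the current consensus kernel K_star (illustrative)."""
    sims = np.array([np.sum(K * K_star) /
                     (np.linalg.norm(K) * np.linalg.norm(K_star))
                     for K in kernels])
    sims = np.clip(sims, 0.0, None)
    return sims / sims.sum()

# Alternate between combining kernels and updating their weights.
rng = np.random.default_rng(2)
A = [rng.standard_normal((60, 60)) for _ in range(4)]
kernels = [M @ M.T for M in A]                 # random PSD base kernels
w = np.full(4, 0.25)                           # start from uniform weights
for _ in range(5):
    K_star = consensus_kernel(kernels, w)
    w = similarity_weights(kernels, K_star)
```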

Because the views of multi-view data describe characteristics of the same target, there is naturally a strong correlation between them, and fully mining the inherent features of the data can effectively enhance the capability of multi-view subspace clustering (MSC) algorithms. Many scholars have conducted a great deal of research toward this goal [28]. Using collaborative learning, a consensus representation is successfully obtained across views (Co-Reg) [29]. Of course, a large amount of redundant features in the data will reduce the stability of clustering; in this regard, a robust clustering method (RMSC) is designed by applying a Markov chain [30]. By using a common clustering structure to ensure consistency between different views, Gao et al. [31] propose to simultaneously cluster the subspace representations of each view. Solving models efficiently is also a concern of researchers; to this end, Lu et al. [32] present a tight relaxation of SSC and propose a convex pairwise SSC model (CSMSC). Inspired by Co-Reg, [33] simultaneously imposes $\|\cdot\|_{*}$ and $\|\cdot\|_1$ on the obtained consensus representation matrix (MLRSSC). To suppress noise in the data and deal with nonlinear structures, a low-rank kernel-based MSC model (RLKMSC) is proposed, which has good clustering performance and robustness [34]. To overcome the influence of noise and improve the model's ability to mine the underlying subspace structure, a robust energy preserving embedding (REPE) method is proposed [35]. However, existing methods assign only a uniform weight to each view, whereas the damage degree of each sample within the same view may differ, which means that the confidence of each sample may differ (for example, the confidence levels of outliers or noisy samples should be lower than those of uncorrupted samples). Therefore, assigning a uniform weight per view may lead to a suboptimal solution.

Recently, with the resurgence of deep learning (DL), many researchers have applied it to single-view subspace clustering tasks. In [36], based on a deep autoencoder, deep subspace clustering networks (DSC-Nets) are proposed, which creatively introduce a novel "self-representation layer" between the encoder and the decoder. Peng et al. [37] propose a deep extension of sparse subspace clustering, called $\ell_1$-norm deep subspace clustering. Of course, some scholars have also designed deep subspace clustering networks for multi-view data [38], [39]. These methods have achieved impressive results, and generally speaking, the clustering performance of DL-based methods is better than that of traditional methods. However, DL-based methods have higher requirements on the number of samples and on computer hardware, which limits their applicability. The focus of this work differs from that of DL-based methods, so we do not discuss them further here.

In conclusion, there are many studies on dealing with non-linear structures and complex noise in data, but there is not yet a method that can simultaneously handle samples having different confidence levels within the same view. Therefore, we aim to design a novel robust MSC model that jointly learns the sample weights and the consensus representation. Considering the nonlinearity of high-dimensional data and the possibility of noise pollution, we propose a weighted truncated Schatten p-norm (WTSN) regularization, and design an adaptive low-rank multi-kernel learning strategy via WTSN and mixture correntropy [40].
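For intuition only, a per-sample confidence can be tied to the self-expression residual: a sample that is poorly reconstructed by the others is more likely corrupted and should receive a smaller weight. The sketch below uses a Gaussian (correntropy-style) decay of the residual; the bandwidth `sigma` and the rule itself are our illustrative choices, not the exact weighting scheme derived in this paper:

```python
import numpy as np

def sample_confidence(X, Z, sigma=1.0):
    """Per-sample confidence from self-expression residuals.

    residual_i = ||x_i - X z_i||_2; clean samples (small residual)
    get weights near 1, while outliers decay toward 0.
    """
    residuals = np.linalg.norm(X - X @ Z, axis=0)
    return np.exp(-residuals ** 2 / (2.0 * sigma ** 2))
```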

Simply put, the main contributions of this work are as follows:

  • To highlight the confidence level of pure samples, we design a unique weighting strategy to consider the confidence of both views and samples.

  • To better cope with the nonlinearity and noise pollution of data, we design an advanced MKL strategy. The weights of all basic kernels are automatically allocated based on their similarity to the consensus kernel.

  • To obtain a better low-rank approximation, we propose a weighted truncated Schatten p-norm (WTSN), and give the corresponding optimization procedure.

The remainder of this paper is organized as follows. Section 2 provides the relevant preliminary knowledge used in this work. Section 3 proposes the CLWRMSC model and introduces its optimization and solution; meanwhile, the complexity and convergence of the model are analyzed. Section 4 introduces the relevant experiments in detail and analyzes the results carefully. Section 5 summarizes this work.

Section snippets

Low-rank regularization

The truncated nuclear norm (TNN) can approximate the rank function accurately and robustly [41]. Given $X\in\mathbb{R}^{D\times N}$, its TNN is
$$\|X\|_{r}=\sum_{i=r+1}^{\min(D,N)}\delta_i(X)=\|X\|_{*}-\operatorname{tr}(B_1XB_2^{\top}).$$

Suppose the SVD of $X$ is $U\Delta V^{\top}$, where $U=(u_1,\cdots,u_D)\in\mathbb{R}^{D\times D}$, $\Delta\in\mathbb{R}^{D\times N}$ is a diagonal matrix composed of the $\delta_i(X)$, and $V=(v_1,\cdots,v_N)\in\mathbb{R}^{N\times N}$. $B_1\in\mathbb{R}^{r\times D}$ and $B_2\in\mathbb{R}^{r\times N}$ are the transposes of the first $r$ columns of $U$ and $V$, respectively, and they satisfy $B_1B_1^{\top}=I_{r\times r}$ and $B_2B_2^{\top}=I_{r\times r}$.
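A direct numpy computation of the TNN, together with a check of the identity $\|X\|_r=\|X\|_{*}-\operatorname{tr}(B_1XB_2^{\top})$ (a minimal sketch of the definition above):

```python
import numpy as np

def truncated_nuclear_norm(X, r):
    """||X||_r = sum of all singular values except the r largest."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    B1 = U[:, :r].T              # transpose of the first r columns of U
    B2 = Vt[:r, :]               # transpose of the first r columns of V
    tnn = s[r:].sum()
    # Equivalent form: nuclear norm minus tr(B1 X B2^T).
    assert np.isclose(tnn, s.sum() - np.trace(B1 @ X @ B2.T))
    return tnn

X = np.random.randn(40, 30)
print(truncated_nuclear_norm(X, r=5))
```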

In addition, the weighted Schatten p-norm (WSNM) [42] of $X$, raised to the power $p$, can be expressed as
$$\|X\|_{w,S_p}^{p}=\sum_{i=1}^{\min(D,N)}w_i\,\delta_i^{p}(X),$$
where $w_i\geq 0$ is the weight assigned to the $i$-th singular value $\delta_i(X)$.
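Evaluating the WSNM is equally direct (a minimal sketch; the weight vector here is an arbitrary example):

```python
import numpy as np

def weighted_schatten_p(X, w, p):
    """||X||_{w,Sp}^p = sum_i w_i * delta_i(X)^p over the singular values."""
    s = np.linalg.svd(X, compute_uv=False)   # singular values, descending
    return float(np.sum(w * s ** p))

X = np.random.randn(40, 30)
w = np.linspace(0.0, 1.0, 30)   # small weights on the leading (large)
                                # singular values, so they are penalized less
print(weighted_schatten_p(X, w, p=0.7))
```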

The proposed method

Next, we design a novel confidence level auto-weighting robust MSC (CLWRMSC) algorithm incorporating adaptive low-rank multi-kernel learning. The flowchart of CLWRMSC is shown in Fig. 1. In the first step, our model learns the coefficient matrix $Z^{(v)}$ independently for each view by adaptive low-rank multi-kernel learning (ALMKL) and BDR. In particular, the MKL strategy in our model is designed via mixture correntropy, where the weights of all basic kernels are automatically allocated based on their similarity to the consensus kernel.

Experiments and analysis

Next, we execute several experiments to assess the performance of CLWRMSC from multiple aspects. In particular, our model mainly comprises three parts: (1) adaptive low-rank multi-kernel learning (ALMKL), (2) block diagonal representation (BDR), and (3) confidence level auto-weighting. To better confirm the capability of the first two parts, model (13), which can be seen as the single-view form of CLWRMSC, is used for an ablation study.

Conclusions

For multi-view data, an advanced robust subspace clustering model (CLWRMSC) is designed, which pays attention to the confidence levels of both the views and the samples while learning the consensus representation of all views. Specifically, we proposed a robust MKL strategy based on MCIM to train a consensus kernel, and learned an affinity graph in kernel space. In this process, a low-rank constraint is applied to the learned consensus kernel to encourage the underlying feature space to consist of multiple low-dimensional subspaces.

CRediT authorship contribution statement

Xiaoqian Zhang: Conceptualization, Methodology, Software. Jing Wang: Supervision, Writing - review & editing. Xuqian Xue: Writing - original draft, Data curation, Software. Huaijiang Sun: Supervision. Jiangmei Zhang: Software, Validation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grants 62102331, 62176125 and 61772272, and by the Sichuan Science and Technology Program under Grants 2020YFS0360 and 2021YJ0334.

The authors appreciate Zhenwen Ren for providing the code of JMKSC. The authors also appreciate the anonymous reviewers who provided constructive comments for improving our work.

References (54)

  • Liang Zhao et al., Incomplete multi-view clustering via deep semantic mapping, Neurocomputing (2018)
  • Badong Chen et al., Mixture correntropy for robust learning, Pattern Recogn. (2018)
  • Lei Feng et al., Image compressive sensing via truncated Schatten-p norm regularization, Signal Process.: Image Commun. (2016)
  • Zhao Kang et al., Low-rank kernel learning for graph-based clustering, Knowl.-Based Syst. (2019)
  • Zhenwen Ren et al., Multiple kernel subspace clustering with local structural graph and low-rank consensus kernel learning, Knowl.-Based Syst. (2020)
  • Lijuan Wang et al., Block diagonal representation learning for robust subspace clustering, Inf. Sci. (2020)
  • Qi Wang et al., Spectral embedded adaptive neighbors clustering, IEEE Trans. Neural Networks Learn. Syst. (2019)
  • Ehsan Elhamifar et al., Sparse subspace clustering
  • Guangcan Liu, Zhouchen Lin, and Yong Yu, Robust subspace segmentation by low-rank representation, in Proceedings of the...
  • Canyi Lu et al., Subspace clustering by block diagonal representation, IEEE Trans. Pattern Anal. Mach. Intell. (2018)
  • Ehsan Elhamifar et al., Sparse subspace clustering: Algorithm, theory, and applications, IEEE Trans. Pattern Anal. Mach. Intell. (2013)
  • Guangcan Liu et al., Robust recovery of subspace structures by low-rank representation, IEEE Trans. Pattern Anal. Mach. Intell. (2013)
  • Canyi Lu, Hai Min, Zhongqiu Zhao, Lin Zhu, Deshuang Huang, and Shuicheng Yan, Robust and efficient subspace...
  • Vishal M. Patel et al., Kernel sparse subspace clustering
  • Pan Ji, Ian Reid, Ravi Garg, Hongdong Li, and Mathieu Salzmann, Low-rank kernel subspace clustering, arXiv preprint...
  • Xiaoqian Zhang, Beijia Chen, Huaijiang Sun, Zhigui Liu, Zhenwen Ren, and Yanmeng Li, Robust low-rank kernel subspace...
  • Xinwang Liu, Sihang Zhou, Yueqing Wang, Miaomiao Li, Yong Dou, En Zhu, and Jianping Yin, Optimal neighborhood kernel...

Xiaoqian Zhang received the M.S. degree in Circuits and Systems from the Southwest University of Science and Technology (SWUST), Mianyang, China, in 2013. He then received the Ph.D. degree in Control Science and Engineering from the Nanjing University of Science and Technology, Nanjing, China, in 2021. He is also working as a teacher at SWUST. His research interests cover subspace clustering, sparse representation, deep learning and their applications in image processing.

    Jing Wang received B.A. degree in Process Equipment and Control Engineering from the Tianjin University of Technology, Tianjin, China, in 2019. She is currently pursuing the M.S. degree in control science and engineering at SWUST, Mianyang, China. Her research interest covers subspace clustering, sparse representation, multi-view clustering and their applications in image processing.

Xuqian Xue received the B.S. degree from Changzhi College in 2018, and the M.S. degree from SWUST, Mianyang, China, in 2021. Her research interests include low-rank representation, subspace clustering, and the application of subspace clustering in image segmentation.

    Huaijiang Sun received the B.S. and Ph.D. degrees from Northwestern Polytechnical University, Xi’an, China, in 1990 and 1995, respectively. He is currently a Professor with the School of Computer Science and Engineering, Nanjing University of Science and Technology, China. His research interests include neural networks, machine learning, and human motion analysis and synthesis.

Jiangmei Zhang is a Professor and Doctoral Supervisor, leader of a national innovation team, and a council member of the China Society of Radiation Protection. She graduated from the University of Science and Technology of China with a doctorate in control science and engineering. She has been committed to research on radiation environment perception, intelligent information processing, and applications of nuclear technology and robot technology, and has obtained notable achievements.
