
Pattern Recognition

Volume 134, February 2023, 109114

Semi-supervised adaptive kernel concept factorization

https://doi.org/10.1016/j.patcog.2022.109114

Abstract

Kernelized concept factorization (KCF) has shown its advantage in handling data with nonlinear structures; however, the kernels involved in existing KCF-based methods are empirically predefined, which may compromise performance. In this paper, we propose semi-supervised adaptive kernel concept factorization (SAKCF), which integrates data representation and kernel learning into a unified model so that the two learning processes adapt to each other. SAKCF extends traditional KCF in a semi-supervised manner, encouraging the high-dimensional representation to be consistent with both the limited supervisory information and the local geometric structure. Besides, an alternating iterative algorithm is proposed to solve the resulting constrained optimization problem. Experimental results on six real-world data sets verify the effectiveness and advantages of SAKCF over state-of-the-art methods when applied to the clustering task.

Introduction

Nonnegative matrix factorization (NMF) [1], [2] has shown its effectiveness for the representation of linear data, and many extended NMF algorithms have been proposed to improve its performance on data representation [3], [4], [5], [6], [7] and clustering [8], [9], [10] over the past decade. In particular, as a variant of NMF, concept factorization (CF) [11] models its basis as a linear combination of the input data samples in the original space, which inherits the strengths of NMF, such as physically meaningful representations and effectiveness for representing linear data. To better capture the underlying structure, many CF-based models incorporating local geometric information have been proposed. For example, Cai et al. [12] proposed locally consistent CF, in which the low-dimensional representation preserves the manifold structure through a graph regularization. Liu et al. [13] imposed a locality constraint on CF and proposed local-coordinate CF, where each sample is represented by only a few nearby bases. Pei et al. [14] improved locally consistent CF by jointly learning an adaptive weight matrix for graph regularization and the factored matrices for CF. Zhang et al. [15] introduced an auto-weighted reconstruction graph learning strategy into local-coordinate CF, where the L2,1 norm is imposed to enhance the robustness of CF to noise. Instead of using the Frobenius norm as in locally consistent CF, Peng et al. [16] proposed a correntropy-based CF model, which adopts correntropy to measure the similarity between the input data matrix and the reconstructed one. Ma et al. [17] improved conventional CF by introducing both manifold and adaptive neighbor structures into a unified model.
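Several of the models above regularize the low-dimensional representation with a k-nearest-neighbor affinity graph. As a minimal sketch (not the authors' code; the binary 0/1 weighting and the function name are illustrative assumptions), the affinity matrix W and graph Laplacian L = D − W used in penalties of the form Tr(V^T L V) can be built as follows:

```python
import numpy as np

def knn_graph_laplacian(X, k=3):
    """Build a binary k-NN affinity W and graph Laplacian L = D - W.
    X: (n, m) array of n samples. Returns (W, L), both (n, n)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]  # skip index 0: the sample itself
        W[i, nbrs] = 1.0
    W = np.maximum(W, W.T)                 # symmetrize the affinity
    L = np.diag(W.sum(axis=1)) - W         # unnormalized graph Laplacian
    return W, L

X = np.random.default_rng(0).random((10, 2))
W, L = knn_graph_laplacian(X, k=3)
```

By construction each row of L sums to zero and L is symmetric positive semi-definite, which is what makes Tr(V^T L V) a valid smoothness penalty on the representation V.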

In addition, considering that a small amount of supervisory information may be available in practical applications, CF has been extended in a semi-supervised manner. Liu et al. [18] proposed constrained CF, which introduces a label indicator matrix into the low-dimensional representation to enforce that data samples in the same class share an identical low-dimensional representation. Based on [18], Shu et al. [19] proposed a local regularization constrained CF, which takes both geometric and discriminative information into account. Zhang et al. [20] relaxed the reconstruction error of constrained CF and proposed a joint classification and data representation framework. Zhou et al. [21] proposed a robust semi-supervised CF, which integrates robust adaptive embedding and a maximum correntropy criterion based CF into a unified model.
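In constrained CF [18], the label indicator matrix is absorbed into an auxiliary matrix A so that writing V = AZ forces labeled samples of the same class to share one low-dimensional representation, while unlabeled samples remain free. A minimal sketch (the function name and one-hot construction are illustrative assumptions; [18] should be consulted for the exact formulation):

```python
import numpy as np

def ccf_indicator(labels, n):
    """Build the auxiliary matrix A of constrained CF so that V = A Z.
    labels: length-l array of class ids (0..c-1) for the first l samples;
    n: total number of samples. Returns A of shape (n, c + n - l)."""
    l = len(labels)
    c = len(set(labels.tolist()))
    C = np.zeros((l, c))
    C[np.arange(l), labels] = 1.0      # one-hot class indicator for labeled data
    A = np.zeros((n, c + n - l))
    A[:l, :c] = C                      # labeled block: one shared row per class
    A[l:, c:] = np.eye(n - l)          # unlabeled samples are unconstrained
    return A

# Samples 0 and 1 share class 0, so their rows of A (hence of V) coincide.
A = ccf_indicator(np.array([0, 0, 1]), n=5)
```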

Technically, to ensure that the basis and the original data samples lie in the same space, CF represents each basis vector as a linear combination of the original data samples. However, when applied to nonlinearly distributed data, the way that CF formulates the basis may lead to an uninterpretable basis. As an example, Fig. 1 intuitively illustrates the linear combination of the data samples in the original data space, where the weights of the linear combination are set to 1. Since kernel methods can deal with nonlinear correlations in data, both CF [11] and locally consistent CF [12] have been extended to kernel versions, which improve the clustering performance. Li et al. [22] proposed manifold kernel concept factorization (MKCF), which introduces manifold kernel learning into CF to encode the geometric information in the kernel space. Similar to MKCF, Li et al. [23] presented a manifold kernel based local-coordinate CF, which utilizes a manifold kernel to improve the capability of local-coordinate CF in handling nonlinearly distributed data. Instead of selecting a single optimal kernel, Li et al. [24] proposed a multi-kernel based CF, which takes advantage of several kernels by linearly fusing them.

Although the existing kernel CF (KCF)-based methods have demonstrated their advantages and effectiveness in handling linearly inseparable data, they construct the kernel empirically, and it is not clear whether the predefined kernel adapts well to the subsequent learning of the low-dimensional representation. To this end, in this paper, we propose a model, namely semi-supervised adaptive kernel concept factorization (SAKCF), to jointly learn an adaptive kernel and a low-dimensional data representation. In addition to incorporating the two learning processes into a unified model, we consider the situation where limited supervisory information is given, and formulate the proposed method in a semi-supervised scenario. As illustrated in Fig. 2, the proposed SAKCF performs CF in a learned high-dimensional kernel space, in which the high-dimensional representation is consistent with the limited supervisory and local geometric information. Different from locally consistent CF (LCCF), constrained CF (CCF), and local regularization constrained CF (LRCCF), the proposed SAKCF encodes supervisory and geometric information into the kernel space to obtain an informative kernel, rather than directly using these two types of information to regularize the low-dimensional representation. Additionally, the way the proposed model utilizes supervisory and geometric information can be extended to LCCF, CCF, or LRCCF as well.

The main contributions of SAKCF are threefold:

  • SAKCF performs CF in an adaptively learned kernel space, where the limited supervisory and local geometrical information are encoded. The proposed model leverages the power of kernel and limited supervisory information, and such a semi-supervised learning framework can be naturally extended to several variants of CF (e.g., LCCF and CCF).

  • SAKCF integrates the data representation and kernel learning in a unified model to make the two learning processes adapt to each other.

  • We also propose an iterative updating scheme for solving the resulting optimization problem. Theoretical analysis of the proposed algorithm, including convergence and complexity analyses, is presented.

The rest of this paper is organized as follows. In Section 2, we briefly review the rationales of NMF, CF, LCCF, and CCF. In Section 3, we present the proposed SAKCF. We evaluate the performance of SAKCF when used for the clustering task in Section 4. Finally, Section 5 draws the conclusion of this paper.

Notation: Throughout this paper, we use lowercase letters, boldface lowercase letters, and uppercase letters to denote scalars, vectors, and matrices, respectively. The elements of a matrix are denoted by lowercase letters with two subscripts, which indicate the row and column indices, respectively. For a matrix A, we use ‖A‖_F, Tr(A), and A^T to denote its Frobenius norm, trace, and transpose, respectively. N_k(·) indicates the set containing the k nearest neighbors of a typical data sample. For a space F, ‖·‖_F and ⟨·,·⟩_F denote the distance metric and inner product in this space, respectively. I_l denotes an identity matrix of size l×l, and 0 is a matrix with all entries being 0.


Nonnegative matrix factorization and concept factorization

Given a nonnegative data set X = {x_1, x_2, …, x_n} consisting of n samples, each of dimension m, it can be organized as a matrix X ∈ R^{m×n} = [x_1, x_2, …, x_n], where each column represents a vectorial data sample. NMF aims to decompose X into the product of two nonnegative matrices of lower dimensions, i.e., U ∈ R^{m×d} and V ∈ R^{n×d}, where d < min{m, n}. Mathematically, to find such a pair of matrices, NMF solves the following problem:

min_{U,V} ‖X − UV^T‖_F^2, s.t. U ≥ 0, V ≥ 0,

where U ≥ 0 (resp. V ≥ 0) denotes that each element of U (resp. V) is nonnegative.
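The NMF problem above is typically solved with the multiplicative update rules of Lee and Seung [2], which preserve nonnegativity whenever the factors are initialized nonnegative. A minimal sketch (the iteration count, initialization, and small epsilon safeguard against division by zero are illustrative choices):

```python
import numpy as np

def nmf(X, d, n_iter=200, eps=1e-9, seed=0):
    """Multiplicative-update NMF: X (m x n) ~= U V^T, U (m x d), V (n x d)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, d))
    V = rng.random((n, d))
    for _ in range(n_iter):
        # U <- U * (X V) / (U V^T V): elementwise, keeps U nonnegative.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        # V <- V * (X^T U) / (V U^T U): elementwise, keeps V nonnegative.
        V *= (X.T @ U) / (V @ (U.T @ U) + eps)
    return U, V

X = np.abs(np.random.default_rng(1).random((20, 30)))
U, V = nmf(X, d=5)
err = np.linalg.norm(X - U @ V.T, 'fro')
```

Each update monotonically decreases the Frobenius reconstruction error, which is the same convergence argument reused by CF-type algorithms.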

Problem formulation

As mentioned earlier, for linearly inseparable data, the performance of CF applied in the original data space will be limited since its basis is approximated by the linear combination of the original data. KCF is consequently proposed by projecting the data into a high-dimensional reproducing kernel Hilbert space. Kernel propagation [25], [26] utilizes insufficient supervisory information to learn an informative kernel, which will facilitate the following learning tasks, and can be used in KCF
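KCF operates entirely through inner products in the reproducing kernel Hilbert space, so only the kernel matrix K is needed. As a minimal sketch of one common predefined choice — exactly the kind of empirically fixed kernel the paper argues should instead be learned — a Gaussian (RBF) kernel matrix can be computed as follows (the bandwidth sigma is an illustrative assumption):

```python
import numpy as np

def rbf_kernel(X, sigma=1.0):
    """Gaussian kernel matrix K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)).
    X: (n, m) array of n samples. Returns K of shape (n, n)."""
    sq = np.sum(X ** 2, axis=1)
    # Squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b;
    # clip tiny negatives caused by floating-point cancellation.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2 * X @ X.T, 0)
    return np.exp(-d2 / (2 * sigma ** 2))

X = np.random.default_rng(0).random((8, 3))
K = rbf_kernel(X)
```

K is symmetric positive semi-definite with unit diagonal, the properties any adaptively learned kernel must also maintain.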

Experiment

In this section, we carried out extensive experiments on the clustering task to evaluate the proposed method.

Conclusion

In this paper, we have proposed a semi-supervised adaptive kernel concept factorization model, which extends conventional CF to a kernel version and introduces supervisory and local geometric information to construct an informative kernel. The advantage of the proposed method is that SAKCF learns the kernel and the data representation simultaneously, which makes the two learning processes adapt to each other. In addition, we proposed an algorithm to efficiently solve the problem,

CRediT authorship contribution statement

Wenhui Wu: Conceptualization, Formal analysis, Writing – original draft, Writing – review & editing, Validation. Junhui Hou: Conceptualization, Formal analysis, Writing – original draft, Writing – review & editing, Validation. Shiqi Wang: Conceptualization, Formal analysis, Writing – original draft, Writing – review & editing, Validation. Sam Kwong: Conceptualization, Formal analysis, Writing – original draft, Writing – review & editing, Validation. Yu Zhou: Conceptualization, Formal analysis,

Declaration of Competing Interest

The authors declare that they have no conflict of interest.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (Grants 62006158 and 62176160), in part by the Hong Kong Innovation and Technology Commission (InnoHK Project CIMDA), in part by the Hong Kong GRF-RGC General Research Fund under Grant 11209819 (CityU 9042816) and Grant 11203820 (9042598), and in part by the Natural Science Foundation of Shenzhen (University Stability Support Program nos. 20200810150732001).


References (44)

  • D.D. Lee et al.

    Learning the parts of objects by non-negative matrix factorization

    Nature

    (1999)
  • D.D. Lee et al.

    Algorithms for non-negative matrix factorization

    Proc. Adv. Neural Inf. Process. Syst.

    (2001)
  • D. Wang et al.

    Semi-supervised nonnegative matrix factorization via constraint propagation

    IEEE Trans. Cybern.

    (2016)
  • W. Wu et al.

    Simultaneous dimensionality reduction and classification via dual embedding regularized nonnegative matrix factorization

    IEEE Trans. Image Process.

    (2019)
  • W. Wu et al.

    Positive and negative label-driven nonnegative matrix factorization

    IEEE Trans. Circ. Syst. Vid.

    (2020)
  • W. Xu et al.

    Document clustering by concept factorization

    Proc. Int’l Conf. Res. Develop. Inf. Retrieval

    (2004)
  • D. Cai et al.

    Locally consistent concept factorization for document clustering

    IEEE Trans. Knowl. Data Eng.

    (2010)
  • H. Liu et al.

    Local coordinate concept factorization for image representation

    IEEE Trans. Neural Netw. Learn. Syst.

    (2013)
  • X. Pei et al.

    Concept factorization with adaptive neighbors for document clustering

    IEEE Trans. Neural Netw. Learn. Syst.

    (2018)
  • Z. Zhang et al.

    Flexible auto-weighted local-coordinate concept factorization: a robust framework for unsupervised clustering

    IEEE Trans. Knowl. Data Eng.

    (2021)
  • S. Ma et al.

    Self-representative manifold concept factorization with adaptive neighbors for clustering

    Proc. 27th Int. Joint Conf. Artif. Intell.

    (2018)
  • H. Liu et al.

    Constrained concept factorization for image representation

    IEEE Trans. Cybern.

    (2014)

    Wenhui Wu, Shenzhen University, [email protected]

    Junhui Hou, City University of Hong Kong, [email protected]

    Shiqi Wang, City University of Hong Kong, [email protected]

    Sam Kwong, City University of Hong Kong, [email protected]

    Yu Zhou, Shenzhen University, [email protected]
