
Pattern Recognition

Volume 134, February 2023, 109114

Semi-supervised adaptive kernel concept factorization

https://doi.org/10.1016/j.patcog.2022.109114

Abstract

Kernelized concept factorization (KCF) has shown its advantage in handling data with nonlinear structures; however, the kernels involved in existing KCF-based methods are empirically predefined, which may compromise performance. In this paper, we propose semi-supervised adaptive kernel concept factorization (SAKCF), which integrates data representation and kernel learning into a unified model so that the two learning processes adapt to each other. SAKCF extends traditional KCF in a semi-supervised manner, encouraging the high-dimensional representation to be consistent with both the limited supervisory information and the local geometric structure. Besides, an alternating iterative algorithm is proposed to solve the resulting constrained optimization problem. Experimental results on six real-world data sets verify the effectiveness and advantages of SAKCF over state-of-the-art methods when applied to the clustering task.

Introduction

Nonnegative matrix factorization (NMF) [1], [2] has shown its effectiveness for the representation of linear data, and many extended NMF algorithms have been proposed to improve its performance on data representation [3], [4], [5], [6], [7] and clustering [8], [9], [10] over the past decade. In particular, as a variant of NMF, concept factorization (CF) [11] models its basis as a linear combination of the input data samples in the original space, which inherits the strengths of NMF, such as physically meaningful representations and effectiveness for representing linear data. To better capture the underlying structure, many CF-based models incorporating local geometric information have been proposed. For example, Cai et al. [12] proposed locally consistent CF, in which the low-dimensional representation preserves the manifold structure through a graph regularization. Liu et al. [13] imposed a locality constraint on CF and proposed local-coordinate CF, where each sample is represented by only a few nearby bases. Pei et al. [14] improved locally consistent CF by jointly learning an adaptive weight matrix for graph regularization and the factored matrices for CF. Zhang et al. [15] introduced an auto-weighted reconstruction graph learning strategy into local-coordinate CF, where the L2,1 norm is imposed to enhance the robustness of CF to noise. Instead of using the Frobenius norm as in locally consistent CF, Peng et al. [16] proposed a correntropy-based CF model, which adopts correntropy to measure the similarity between the input data matrix and the reconstructed one. Ma et al. [17] improved conventional CF by introducing both manifold and adaptive neighbor structures into a unified model.
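Several of the models above regularize the low-dimensional representation with a k-nearest-neighbor affinity graph. As a minimal sketch (not the authors' code; the binary 0/1 weighting and the function name are illustrative assumptions), the affinity matrix W and graph Laplacian L = D − W used in penalties of the form Tr(V^T L V) can be built as follows:

```python
import numpy as np

def knn_graph_laplacian(X, k=3):
    """Build a binary k-NN affinity W and graph Laplacian L = D - W.
    X: (n, m) array of n samples. Returns (W, L), both (n, n)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]  # skip index 0: the sample itself
        W[i, nbrs] = 1.0
    W = np.maximum(W, W.T)                 # symmetrize the affinity
    L = np.diag(W.sum(axis=1)) - W         # unnormalized graph Laplacian
    return W, L

X = np.random.default_rng(0).random((10, 2))
W, L = knn_graph_laplacian(X, k=3)
```

By construction each row of L sums to zero and L is symmetric positive semi-definite, which is what makes Tr(V^T L V) a valid smoothness penalty on the representation V.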

In addition, considering that a small amount of supervisory information may be available in practical applications, CF has been extended in a semi-supervised manner. Liu et al. [18] proposed constrained CF, which introduces a label indicator matrix into the low-dimensional representation to enforce that data samples in the same class share an identical low-dimensional representation. Based on [18], Shu et al. [19] proposed a local regularization constrained CF, which takes both geometric and discriminative information into account. Zhang et al. [20] relaxed the reconstruction error of constrained CF and proposed a joint classification and data representation framework. Zhou et al. [21] proposed a robust semi-supervised CF, which integrates robust adaptive embedding and a maximum correntropy criterion based CF into a unified model.
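In constrained CF [18], the label indicator matrix is absorbed into an auxiliary matrix A so that writing V = AZ forces labeled samples of the same class to share one low-dimensional representation, while unlabeled samples remain free. A minimal sketch (the function name and one-hot construction are illustrative assumptions; [18] should be consulted for the exact formulation):

```python
import numpy as np

def ccf_indicator(labels, n):
    """Build the auxiliary matrix A of constrained CF so that V = A Z.
    labels: length-l array of class ids (0..c-1) for the first l samples;
    n: total number of samples. Returns A of shape (n, c + n - l)."""
    l = len(labels)
    c = len(set(labels.tolist()))
    C = np.zeros((l, c))
    C[np.arange(l), labels] = 1.0      # one-hot class indicator for labeled data
    A = np.zeros((n, c + n - l))
    A[:l, :c] = C                      # labeled block: one shared row per class
    A[l:, c:] = np.eye(n - l)          # unlabeled samples are unconstrained
    return A

# Samples 0 and 1 share class 0, so their rows of A (hence of V) coincide.
A = ccf_indicator(np.array([0, 0, 1]), n=5)
```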

Technically, to ensure that the basis and the original data samples lie in the same space, CF represents each basis vector as a linear combination of the original data samples. However, when applied to nonlinearly distributed data, the way that CF formulates the basis may lead to an uninterpretable basis. As an example, Fig. 1 intuitively illustrates the linear combination of the data samples in the original data space, where the weights of the linear combination are set to 1. Since kernel methods can deal with nonlinear correlations in data, both CF [11] and locally consistent CF [12] have been extended to kernel versions, which improve the clustering performance. Li et al. [22] proposed manifold kernel concept factorization (MKCF), which introduces manifold kernel learning into CF to encode the geometric information in the kernel space. Similar to MKCF, Li et al. [23] presented a manifold kernel based local-coordinate CF, which utilizes a manifold kernel to improve the capability of local-coordinate CF in handling nonlinearly distributed data. Instead of selecting a single optimal kernel, Li et al. [24] proposed a multi-kernel based CF, which takes advantage of several kernels by linearly fusing them.

Although the existing kernel CF (KCF)-based methods have demonstrated their advantages and effectiveness in handling linearly inseparable data, they construct the kernel empirically, and it is not clear whether the predefined kernel adapts well to the subsequent learning of the low-dimensional representation. To this end, in this paper, we propose a model, namely semi-supervised adaptive kernel concept factorization (SAKCF), to jointly learn an adaptive kernel and a low-dimensional data representation. In addition to incorporating the two learning processes into a unified model, we consider the situation where limited supervisory information is given, and formulate the proposed method in a semi-supervised scenario. As illustrated in Fig. 2, the proposed SAKCF performs CF in a learned high-dimensional kernel space, in which the high-dimensional representation is consistent with the limited supervisory and local geometric information. Different from locally consistent CF (LCCF), constrained CF (CCF), and local regularization constrained CF (LRCCF), the proposed SAKCF encodes supervisory and geometric information into the kernel space to obtain an informative kernel, rather than directly using these two types of information to regularize the low-dimensional representation. Additionally, the way the proposed model utilizes supervisory and geometric information can be extended to LCCF, CCF, or LRCCF as well.

The main contributions of SAKCF are threefold:

  • SAKCF performs CF in an adaptively learned kernel space, where the limited supervisory and local geometrical information are encoded. The proposed model leverages the power of kernel and limited supervisory information, and such a semi-supervised learning framework can be naturally extended to several variants of CF (e.g., LCCF and CCF).

  • SAKCF integrates the data representation and kernel learning in a unified model to make the two learning processes adapt to each other.

  • We also propose an iterative updating scheme for solving the resulting optimization problem. Theoretical analysis of the proposed algorithm, including convergence and complexity analyses, is presented.

The rest of this paper is organized as follows. In Section 2, we briefly review the rationales of NMF, CF, LCCF, and CCF. In Section 3, we present the proposed SAKCF. We evaluate the performance of SAKCF when used for the clustering task in Section 4. Finally, Section 5 draws the conclusion of this paper.

Notation: Throughout this paper, we use lowercase letters, boldface lowercase letters, and uppercase letters to denote scalars, vectors, and matrices, respectively. The elements of a matrix are denoted by lowercase letters with two subscripts, which indicate the row and column indices, respectively. For a matrix A, we use ‖A‖_F, Tr(A), and A^T to denote its Frobenius norm, trace, and transpose, respectively. N_k(·) indicates the set containing the k nearest neighbors of a typical data sample. For a space F, ‖·‖_F and ⟨·,·⟩_F denote the distance metric and inner product in this space, respectively. I_l denotes an identity matrix of size l×l, and 0 is a matrix with all entries being 0.


Nonnegative matrix factorization and concept factorization

Given a nonnegative data set X = {x_1, x_2, …, x_n} consisting of n samples, each of dimension m, it can be organized as a matrix X ∈ R^{m×n} = [x_1, x_2, …, x_n], where each column represents a vectorial data sample. NMF aims to decompose X into the product of two nonnegative matrices of lower dimensions, i.e., U ∈ R^{m×d} and V ∈ R^{n×d}, where d < min{m, n}. Mathematically, to find such a pair of matrices, NMF solves the following problem:

min_{U,V} ‖X − UV^T‖_F^2, s.t. U ≥ 0, V ≥ 0,

where U ≥ 0 (resp. V ≥ 0) denotes that each element of U (resp. V) is nonnegative.
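The NMF problem above is typically solved with the multiplicative update rules of Lee and Seung [2], which preserve nonnegativity whenever the factors are initialized nonnegative. A minimal sketch (the iteration count, initialization, and small epsilon safeguard against division by zero are illustrative choices):

```python
import numpy as np

def nmf(X, d, n_iter=200, eps=1e-9, seed=0):
    """Multiplicative-update NMF: X (m x n) ~= U V^T, U (m x d), V (n x d)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, d))
    V = rng.random((n, d))
    for _ in range(n_iter):
        # U <- U * (X V) / (U V^T V): elementwise, keeps U nonnegative.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        # V <- V * (X^T U) / (V U^T U): elementwise, keeps V nonnegative.
        V *= (X.T @ U) / (V @ (U.T @ U) + eps)
    return U, V

X = np.abs(np.random.default_rng(1).random((20, 30)))
U, V = nmf(X, d=5)
err = np.linalg.norm(X - U @ V.T, 'fro')
```

Each update monotonically decreases the Frobenius reconstruction error, which is the same convergence argument reused by CF-type algorithms.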

Problem formulation

As mentioned earlier, for linearly inseparable data, the performance of CF applied in the original data space will be limited since its basis is approximated by the linear combination of the original data. KCF is consequently proposed by projecting the data into a high-dimensional reproducing kernel Hilbert space. Kernel propagation [25], [26] utilizes insufficient supervisory information to learn an informative kernel, which will facilitate the following learning tasks, and can be used in KCF
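KCF operates entirely through inner products in the reproducing kernel Hilbert space, so only the kernel matrix K is needed. As a minimal sketch of one common predefined choice — exactly the kind of empirically fixed kernel the paper argues should instead be learned — a Gaussian (RBF) kernel matrix can be computed as follows (the bandwidth sigma is an illustrative assumption):

```python
import numpy as np

def rbf_kernel(X, sigma=1.0):
    """Gaussian kernel matrix K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)).
    X: (n, m) array of n samples. Returns K of shape (n, n)."""
    sq = np.sum(X ** 2, axis=1)
    # Squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b;
    # clip tiny negatives caused by floating-point cancellation.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2 * X @ X.T, 0)
    return np.exp(-d2 / (2 * sigma ** 2))

X = np.random.default_rng(0).random((8, 3))
K = rbf_kernel(X)
```

K is symmetric positive semi-definite with unit diagonal, the properties any adaptively learned kernel must also maintain.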

Experiment

In this section, we carried out extensive experiments on the clustering task to evaluate the proposed method.

Conclusion

In this paper, we have proposed a semi-supervised adaptive kernel concept factorization model, which extends conventional CF to a kernel version and introduces supervisory and local geometric information to construct an informative kernel. The advantage of the proposed method is that SAKCF learns the kernel and the data representation simultaneously, which makes the two learning processes adapt to each other. In addition, we proposed an algorithm to efficiently solve the problem,

CRediT authorship contribution statement

Wenhui Wu: Conceptualization, Formal analysis, Writing – original draft, Writing – review & editing, Validation. Junhui Hou: Conceptualization, Formal analysis, Writing – original draft, Writing – review & editing, Validation. Shiqi Wang: Conceptualization, Formal analysis, Writing – original draft, Writing – review & editing, Validation. Sam Kwong: Conceptualization, Formal analysis, Writing – original draft, Writing – review & editing, Validation. Yu Zhou: Conceptualization, Formal analysis,

Declaration of Competing Interest

The authors declare that they have no conflict of interest.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (Grants 62006158 and 62176160), in part by the Hong Kong Innovation and Technology Commission (InnoHK Project CIMDA), in part by the Hong Kong GRF-RGC General Research Fund under Grant 11209819 (CityU 9042816) and Grant 11203820 (9042598), and in part by the Natural Science Foundation of Shenzhen (University Stability Support Program nos. 20200810150732001).


References (44)

  • D.D. Lee et al.

    Learning the parts of objects by non-negative matrix factorization

    Nature

    (1999)
  • D.D. Lee et al.

    Algorithms for non-negative matrix factorization

    Proc. Adv. Neural Inf. Process. Syst.

    (2001)
  • D. Wang et al.

    Semi-supervised nonnegative matrix factorization via constraint propagation

    IEEE Trans. Cybern.

    (2016)
  • W. Wu et al.

    Simultaneous dimensionality reduction and classification via dual embedding regularized nonnegative matrix factorization

    IEEE Trans. Image Process.

    (2019)
  • W. Wu et al.

    Positive and negative label-driven nonnegative matrix factorization

    IEEE Trans. Circ. Syst. Vid.

    (2020)
  • W. Xu et al.

    Document clustering by concept factorization

    Proc. Int’l Conf. Res. Develop. Inf. Retrieval

    (2004)
  • D. Cai et al.

    Locally consistent concept factorization for document clustering

    IEEE Trans. Knowl. Data Eng.

    (2010)
  • H. Liu et al.

    Local coordinate concept factorization for image representation

    IEEE Trans. Neural Netw. Learn. Syst.

    (2013)
  • X. Pei et al.

    Concept factorization with adaptive neighbors for document clustering

    IEEE Trans. Neural Netw. Learn. Syst.

    (2018)
  • Z. Zhang et al.

    Flexible auto-weighted local-coordinate concept factorization: a robust framework for unsupervised clustering

    IEEE Trans. Knowl. Data Eng.

    (2021)
  • S. Ma et al.

    Self-representative manifold concept factorization with adaptive neighbors for clustering

    Proc. 27th Int. Joint Conf. Artif. Intell.

    (2018)
  • H. Liu et al.

    Constrained concept factorization for image representation

    IEEE Trans. Cybern.

    (2014)

    Wenhui Wu, Shenzhen University, [email protected]

    Junhui Hou, City University of Hong Kong, [email protected]

    Shiqi Wang, City University of Hong Kong, [email protected]

    Sam Kwong, City University of Hong Kong, [email protected]

    Yu Zhou, Shenzhen University, [email protected]
