Knowledge-Based Systems

Volume 236, 25 January 2022, 107787

Low-rank supervised and semi-supervised multi-metric learning for classification

https://doi.org/10.1016/j.knosys.2021.107787

Abstract

Multi-metric learning is an important technique for improving classification performance, since a single metric is usually insufficient for complex data. Most existing multi-metric learning approaches have high computational complexity. In this work, two multi-metric learning frameworks are proposed to perform supervised and semi-supervised classification, respectively. Based on these frameworks, we first design a low-rank supervised multi-metric learning model (LSMML), in which multiple local class metrics and one global metric are jointly trained. A joint regularization scheme, composed of a LogDet divergence term and a low-rank term, is designed to incorporate prior knowledge and improve generalization. By learning appropriate metrics, LSMML not only captures the local nonlinear discriminant information of each class to reduce the probability of misclassification, but also enhances stability, alleviates the computational burden, and avoids the risk of overfitting. We then extend LSMML to the semi-supervised scenario and propose a low-rank semi-supervised multi-metric learning approach (LSeMML) to handle data with scarce labels. Alternating iterative algorithms are designed to optimize both LSMML and LSeMML; at each iteration, only geodesically convex subproblems with closed-form solutions and low computational cost need to be solved. Numerical experiments on various databases, in both supervised and semi-supervised settings, show that the proposed LSMML and LSeMML have a simple form, fast training speed, and good classification performance.

Introduction

Heterogeneous data are growing rapidly with the diversification of information sources. How to appropriately measure the similarity between data with complex structures has been studied extensively in recent years, yet it remains an open problem. Distance is one of the most common measures of similarity, and it underlies many classic approaches such as the kNN classifier [1], k-means clustering [2], and so on.
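To make the role of the metric concrete, here is a minimal sketch (in Python, with hypothetical helper names; not from the paper) of how a kNN classifier uses a learned Mahalanobis-type distance d_M(x, y) = sqrt((x − y)^T M (x − y)); with M equal to the identity, it reduces to plain Euclidean kNN.

```python
import numpy as np

def metric_dist(x, y, M):
    """Distance under a symmetric PSD matrix M: d_M(x, y) = sqrt((x - y)^T M (x - y))."""
    d = x - y
    return np.sqrt(d @ M @ d)

def knn_predict(x, X_train, y_train, M, k=3):
    """kNN majority vote under d_M; M = np.eye(dim) recovers plain Euclidean kNN."""
    dists = np.array([metric_dist(x, xi, M) for xi in X_train])
    nearest = np.argsort(dists)[:k]                    # indices of the k closest samples
    labels, counts = np.unique(np.asarray(y_train)[nearest], return_counts=True)
    return labels[np.argmax(counts)]                   # most frequent label among them
```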

As an active and meaningful research direction, distance metric learning has attracted widespread attention over the past years, with applications such as person re-identification [3], [4], [5], image retrieval [6], [7], robust learning [8], [9], and so on. Most existing approaches [10], [11], [12] aim at learning one global metric to capture the overall geometric information of the input space. For example, Davis et al. [13] proposed the information-theoretic metric learning (ITML) algorithm, which makes full use of prior information to pull samples of the same class close and push samples of different classes far apart. The large margin nearest neighbor (LMNN) algorithm [14], designed by Weinberger and Saul, incorporates local discriminant information to pull similar neighbors closer. Regrettably, in numerous practical applications [15], [16], [17], no single distance metric can satisfy all constraints of complex data; hence, single-metric learning algorithms often fail to capture the inherent nonlinear structure of complex data. One common remedy is to kernelize linear metric learning methods [18], [19] to make up for this deficiency, but the computational complexity is intolerable when the number of samples is huge. Multi-metric learning is another way to analyze complex data; learning multiple metrics over the input space is usually a more effective and lower-cost strategy than kernel metric learning.
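For reference, ITML's standard formulation (as in Davis et al. [13]) regularizes toward a prior metric $M_0$ via the LogDet divergence while enforcing pairwise constraints; the notation below follows the usual presentation and is included here only for clarity:

\[
\begin{aligned}
\min_{M \succeq 0}\ & D_{\mathrm{ld}}(M, M_0) = \operatorname{tr}(M M_0^{-1}) - \log\det(M M_0^{-1}) - d \\
\text{s.t.}\ & d_M^2(x_i, x_j) = (x_i - x_j)^\top M (x_i - x_j) \le u, \quad (x_i, x_j) \in \mathcal{S} \text{ (same class)}, \\
& d_M^2(x_i, x_j) \ge \ell, \quad (x_i, x_j) \in \mathcal{D} \text{ (different classes)}.
\end{aligned}
\]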

To the best of our knowledge, existing multi-metric learning approaches fall into several directions. The first is multi-feature representation, which mainly processes high-dimensional data: multiple low-dimensional representations are first extracted from the high-dimensional data according to task requirements, and a local metric is then learned on each representation to capture local attribute information. Following this guideline, a large number of multi-metric algorithms [20], [21], [22], [23] have been developed to deal with complex data. The second direction is to learn a local distance metric for each sample, either test samples or training samples [24], [25]. For instance, Domeniconi et al. [26] learned one local distance metric for each test sample, yielding the adaptive metric nearest neighbor (ADAMENN) method, and Mu et al. [27] proposed the local discriminative distance metrics (LDDM) algorithm to learn a distance metric for each training sample. In this way, each local metric can mine the feature information of each sample to the maximum extent and achieve a good classification effect. However, this strategy runs the risk of overfitting and incurs prohibitive time cost when the number of samples is large. The third direction is to learn a local class metric for each local region [28], [29], [30]. For example, Weinberger and Saul [14] proposed multi-metric LMNN (mmLMNN), which treats each class as a local region and learns multiple local class metrics. Cluster multi-metric learning (CMML) [31], proposed by Nguyen, first applies k-means clustering and then learns one local metric for each cluster. This third direction allows different local metrics to be learned, which facilitates processing heterogeneous data; compared with the second direction, it reduces the probability of overfitting and consumes less time. Unfortunately, the time complexity of most existing approaches in this direction is still high. Many approaches, such as mmLMNN, ignore the learning of a global metric, which hinders the spread of side information and reduces model stability, while CMML, as supervised multi-metric learning based on clustering, weakens the impact of label information.
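As a concrete illustration of the third direction, an mmLMNN-style classifier measures the distance from a query to each training example with the metric of that example's class. The following is a minimal sketch with hypothetical function names, not code from any of the cited implementations:

```python
import numpy as np

def multi_metric_knn_predict(x, X_train, y_train, metrics, k=3):
    """mmLMNN-style decision rule (illustrative): the distance to each training
    sample is measured with the local metric of that sample's own class.
    `metrics` maps a class label to its learned PSD matrix M_c."""
    y_train = np.asarray(y_train)
    dists = np.array([
        np.sqrt((x - xi) @ metrics[yi] @ (x - xi))   # d_{M_c}(x, x_i), c = class of x_i
        for xi, yi in zip(X_train, y_train)
    ])
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]
```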

In this paper, motivated by the above models, we study a joint learning scenario, illustrated in Fig. 1, which aims at learning one local class metric for each class and one global metric for all data. The local class metrics capture the nonlinear geometric distribution of each class, while the global metric preserves shared attribute information and promotes the dissemination of side information. On the one hand, this scenario gives full play to the role of label information to increase classification performance and remedies some shortcomings of multiple local metric learning, such as high time consumption and instability; on the other hand, it can adaptively perform multi-metric or single-metric learning by adjusting parameters as required. Furthermore, we attempt to jointly learn one global metric and multiple local class metrics in a semi-supervised framework. Based on this guideline of joint metric learning, a low-rank supervised multi-metric approach (LSMML) is proposed to better deal with heterogeneously distributed data at low time cost. Specifically, LSMML has the following merits:

(1) LSMML adopts a joint multi-metric learning strategy to improve classification performance and maintain stability. If the global metric M0 performs well in every local region, the local metrics satisfy Mc ≈ M0, c = 1, 2, …, C, i.e., the model degenerates into global metric learning; but if the local metrics Mc, c = 1, 2, …, C, are dissimilar to the global metric M0, it conducts local multi-metric learning (see the schematic objective below).
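A schematic objective consistent with this description (not the paper's exact formulation, which appears in Section 3) couples each local metric to the global one through the LogDet divergence:

\[
\min_{M_0,\, M_1, \ldots, M_C \,\succeq\, 0}\ \sum_{c=1}^{C} \ell_c(M_c) \;+\; \lambda \sum_{c=1}^{C} D_{\mathrm{ld}}(M_c, M_0) \;+\; \gamma\, \mathcal{R}(M_0),
\]

where $\ell_c$ is a class-wise metric-learning loss and $\mathcal{R}$ a low-rank regularizer. Large $\lambda$ forces $M_c \to M_0$ (global metric learning), while small $\lambda$ decouples the local metrics (multi-metric learning).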

(2) A low-rank term is introduced to constrain the generalization error bound of LSMML: the smaller the generalization error bound, the higher the generalization performance. Moreover, the low-rank idea enables the global metric to alleviate the impact of redundant features and reduce the risk of overfitting.
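Since the rank function is non-convex, low-rank terms of this kind are commonly implemented via the nuclear norm, its standard convex surrogate; whether LSMML uses exactly this surrogate is specified in Section 3:

\[
\|M\|_* = \sum_{i} \sigma_i(M), \qquad \|M\|_* = \operatorname{tr}(M) \ \text{ for } M \succeq 0,
\]

since the singular values of a positive semidefinite matrix coincide with its eigenvalues.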

(3) LSMML adopts a Riemannian measure (the LogDet divergence) instead of the Euclidean metric to better exploit the difference and complementarity between the global metric and the local class metrics on the Riemannian manifold, thereby improving classification performance.

(4) LSMML deals with a convex and geodesically convex function and obtains a globally optimal closed-form solution in each iteration, so that it enjoys low computational complexity and good classification performance.

(5) Furthermore, we extend LSMML to semi-supervised learning and propose a low-rank semi-supervised multi-metric classification approach (LSeMML). In LSeMML, a manifold regularization term is introduced to make full use of the geometric structure information embedded in both labeled and unlabeled data. This is a successful attempt to incorporate the idea of multi-metric learning into semi-supervised learning.
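The standard manifold regularization term for metric learning (a common form; the paper's exact variant is given in Section 4) penalizes large distances between graph neighbors, using a symmetric kNN adjacency W built over both labeled and unlabeled samples:

\[
R(M) = \sum_{i,j} W_{ij}\, d_M^2(x_i, x_j) = 2\,\operatorname{tr}\!\left(M X L X^\top\right), \qquad L = D - W,\ \ D_{ii} = \sum_j W_{ij},
\]

where X stacks the samples as columns and L is the unnormalized graph Laplacian.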

The rest of this paper is organized as follows. Section 2 briefly reviews supervised and semi-supervised metric learning approaches and GMML. Sections 3 and 4 describe the proposed LSMML and LSeMML, respectively. Section 5 analyzes and compares the experimental results of all approaches to show the performance of the proposed LSMML and LSeMML. Finally, we present the conclusion in Section 6.


Related work

In this section, we briefly review three topics: the development of supervised and semi-supervised metric learning, the GMML approach [32], and the framework of semi-supervised metric learning.
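For completeness, GMML [32] minimizes $\operatorname{tr}(M S) + \operatorname{tr}(M^{-1} D)$, where $S$ and $D$ are the scatter matrices of similar and dissimilar pairs, $S = \sum_{(i,j) \in \mathcal{S}} (x_i - x_j)(x_i - x_j)^\top$ and likewise for $D$. Its minimizer is the geometric mean of $S^{-1}$ and $D$ on the SPD manifold:

\[
M^\star = S^{-1} \#_{1/2}\, D = S^{-1/2}\left(S^{1/2} D\, S^{1/2}\right)^{1/2} S^{-1/2},
\]

i.e., the midpoint of the geodesic joining $S^{-1}$ and $D$; this geodesic-convexity structure is what makes closed-form updates available.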

Low-rank geometric mean multi-metric learning

In this section, a novel supervised multi-metric learning approach is proposed, which jointly learns one global metric and multiple local class metrics to capture the complex structure information of each class.

Semi-supervised low-rank geometric mean multi-metric learning

Supervised metric learning algorithms take advantage of abundant label information to obtain the desired distance metric. In practice, however, collecting labels is time-consuming and challenging, which has given rise to a large number of semi-supervised metric learning approaches. In this section, we extend the proposed LSMML to the semi-supervised learning framework.
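As a sketch of the semi-supervised ingredient: manifold regularization needs a graph Laplacian built over labeled and unlabeled samples together. Below is a minimal construction (a hypothetical helper, assuming a Gaussian-weighted kNN graph; the paper's exact graph may differ):

```python
import numpy as np

def knn_graph_laplacian(X, k=5, sigma=1.0):
    """Build a symmetrized kNN affinity W with Gaussian weights over all
    samples (labeled and unlabeled) and return the Laplacian L = D - W."""
    n = X.shape[0]
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T   # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                   # exclude self-loops
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[:k]               # k nearest neighbors of sample i
        W[i, nbrs] = np.exp(-d2[i, nbrs] / (2 * sigma**2))
    W = np.maximum(W, W.T)                         # symmetrize the graph
    return np.diag(W.sum(axis=1)) - W              # unnormalized graph Laplacian
```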

Experiment

In this section, numerical experiments are conducted to explore the validity and scalability of the proposed LSMML and LSeMML.

Conclusion

In this work, we adopt a divide-and-conquer strategy to handle supervised and semi-supervised classification problems. First, a supervised multi-metric learning framework (ASMMLF) is proposed, which can perform global metric learning or multi-metric learning under different parameter settings. Based on ASMMLF, we design a specific model, the low-rank supervised multi-metric approach (LSMML), which trains multiple local class metrics and one global metric according to the sample distribution of each class.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (Nos. 11471010 and 11271367). Moreover, the authors thank the referees and editor for their constructive comments to improve the paper.

References (54)

• Q. Wang et al., Semi-supervised metric learning via topology preserving multiple semi-supervised assumptions, Pattern Recognit. (2013)
• T. Cover et al., Nearest neighbor pattern classification, IEEE Trans. Inform. Theory (1967)
• A.K. Jain, Data clustering: 50 years beyond K-means (2008)
• F. Ma et al., True-color and grayscale video person re-identification, IEEE Trans. Inf. Forensics Secur. (2020)
• X. Zhu et al., Video-based person re-identification by simultaneously learning intra-video and inter-video distance metrics, IEEE Trans. Image Process. (2018)
• X. Jing, X. Zhu, F. Wu, X. You, Q. Liu, D. Yue, R. Hu, B. Xu, Super-resolution person re-identification with...
• P. Yang et al., A deep metric learning approach for histopathological image retrieval, Methods (2020)
• Z. Zhang et al., Robust neighborhood preserving projection by nuclear/L2,1-norm regularization for image feature extraction, IEEE Trans. Image Process. (2017)
• L. Fu et al., Learning robust discriminant subspace based on joint L2,p- and L2,s-norm distance metrics, IEEE Trans. Neural Netw. Learn. Syst. (2020)
• J. Goldberger et al., Neighbourhood components analysis
• W. Zuo et al., Distance metric learning via iterated support vector machines, IEEE Trans. Image Process. (2017)
• Y. Ruan et al., A convex model for support vector distance metric learning, IEEE Trans. Neural Netw. Learn. Syst. (2021)
• J.V. Davis et al., Information-theoretic metric learning
• K.Q. Weinberger et al., Distance metric learning for large margin nearest neighbor classification, J. Mach. Learn. Res. (2009)
• C. Shen et al., Efficient dual approach to distance metric learning, IEEE Trans. Neural Netw. Learn. Syst. (2014)
• U.K. Dutta et al., Affinity propagation based closed-form semi-supervised metric learning framework (2018)
• B. Kulis, Metric learning: A survey, Found. Trends Mach. Learn. (2013)