Low-rank supervised and semi-supervised multi-metric learning for classification
Introduction
Heterogeneous data are proliferating as information sources diversify. How to appropriately measure the similarity between data with complex structures has been studied extensively in recent years, yet it remains an open problem. Distance is one of the most common measures of similarity and underpins many classic approaches, such as the kNN classifier [1] and k-means clustering [2].
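To illustrate how a distance measure drives such classifiers, the following is a minimal kNN sketch over Euclidean distance (the toy data and function names are illustrative, not from the paper):

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    # Euclidean distance from the query x to every training sample
    dists = np.linalg.norm(X_train - x, axis=1)
    # Majority vote among the labels of the k nearest neighbors
    nearest = y_train[np.argsort(dists)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]

X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([0.2, 0.1])))  # query lies near class 0
```

Replacing the Euclidean distance here with a learned metric is precisely what metric learning contributes to such pipelines.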
As an active and important research direction, distance metric learning has attracted widespread attention and application in recent years, for example in person identification [3], [4], [5], image retrieval [6], [7], and robust learning [8], [9]. Most existing approaches [10], [11], [12] aim to learn one global metric that captures the overall geometric information of the input space. For example, Davis et al. [13] proposed the information-theoretic metric learning (ITML) algorithm, which makes full use of prior information to pull samples of the same class together and push samples of different classes apart. The large margin nearest neighbor (LMNN) algorithm [14], designed by Weinberger and Saul, incorporates local discriminant information to pull similar neighbors closer. Regrettably, in many practical applications [15], [16], [17], no single distance metric can satisfy all the constraints of complex data; hence, single-metric learning algorithms are likely to fail to capture the inherent nonlinear structure of such data. One common remedy is to kernelize linear metric learning methods [18], [19] to make up for this deficiency, but the computational cost becomes intolerable when the number of samples is large. Multi-metric learning is another way to analyze complex data: learning multiple metrics over the input space is usually a more effective and lower-cost strategy than kernel metric learning.
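Metric learning methods of this kind typically parameterize the distance by a symmetric positive semidefinite matrix M, giving the Mahalanobis (pseudo)metric. A minimal sketch with toy vectors (illustrative, not from the paper):

```python
import numpy as np

def mahalanobis(x, y, M):
    # Squared Mahalanobis distance d_M(x, y)^2 = (x - y)^T M (x - y);
    # M must be symmetric positive semidefinite for a valid pseudometric
    diff = x - y
    return float(diff @ M @ diff)

x, y = np.array([1.0, 0.0]), np.array([0.0, 1.0])
# With M = I this reduces to the squared Euclidean distance
print(mahalanobis(x, y, np.eye(2)))        # 2.0
# A learned M reweights directions: here the second coordinate counts more
M = np.diag([1.0, 4.0])
print(mahalanobis(x, y, M))                # 1*1 + 4*1 = 5.0
```

Methods such as ITML and LMNN differ in how they choose M from pairwise or triplet constraints, but all evaluate distances in this form.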
To the best of our knowledge, multi-metric learning approaches can be divided into several directions. The first is multi-feature representation, which mainly targets high-dimensional data: multiple low-dimensional representations are first extracted from the high-dimensional data according to the task requirements, and a local metric is then learned on each representation to capture local attribute information. Based on this guideline, a large number of multi-metric algorithms [20], [21], [22], [23] have been developed to deal with complex data. The second direction is to learn a local distance metric for each sample, whether a training or a test sample [24], [25]. For instance, Domeniconi et al. [26] learned one local distance metric for each test sample, yielding the adaptive metric nearest neighbor (ADAMENN) method. Mu et al. [27] proposed the local discriminative distance metrics (LDDM) algorithm, which learns a distance metric for each training sample. In this way, each local metric can mine the feature information of its sample to the greatest extent and achieve good classification performance. However, this strategy runs the risk of overfitting and incurs prohibitive training time when the number of samples is large. The third direction is to learn a local class metric for each local region [28], [29], [30]. For example, Weinberger and Saul [14] proposed multi-metric LMNN (mmLMNN), which treats each class as a local region and learns multiple local class metrics. Cluster multi-metric learning (CMML) [31], proposed by Nguyen, first applies k-means clustering and then learns one local metric for each cluster. This third direction allows different local metrics to be learned, which facilitates the processing of heterogeneous data. Compared with the second direction, it reduces the probability of overfitting and enjoys lower time consumption.
Unfortunately, the time complexity of most existing multi-metric learning approaches in the third direction remains high. Many approaches, such as mmLMNN, omit the learning of a global metric, which hinders the propagation of side information and reduces model stability. CMML, as supervised multi-metric learning based on clustering, weakens the impact of label information.
In this paper, motivated by the above models, we study a joint learning scenario, illustrated in Fig. 1, which aims to learn one local class metric for each class and one global metric for all data. The local class metrics capture the nonlinear geometric distribution of each class, while the global metric preserves shared attribute information and promotes the dissemination of side information. On the one hand, this scenario gives full play to the role of label information to increase classification performance and remedies shortcomings of multiple local metric learning such as high time consumption and instability; on the other hand, it can adaptively perform multi-metric or single-metric learning by adjusting parameters as required. Furthermore, we attempt to jointly learn one global metric and multiple local class metrics in a semi-supervised framework. Based on this guideline of joint metric learning, a low-rank supervised multi-metric approach (called LSMML) is proposed to better deal with heterogeneously distributed data at low time cost. Specifically, LSMML offers the following merits:
(1) LSMML adopts a joint multi-metric learning strategy to improve classification performance and maintain stability. If the global metric performs well in every local region, the local metrics converge to the global metric, i.e., the model degenerates into global metric learning. But if the local metrics differ substantially from the global metric, the model conducts local multi-metric learning.
(2) A low-rank term is introduced to constrain the generalization error bound of LSMML: the tighter the bound, the better the generalization performance. Moreover, the low-rank constraint enables the global metric to alleviate the impact of redundant features and reduce the risk of overfitting.
(3) LSMML adopts a Riemannian metric (the LogDet divergence) instead of the Euclidean metric to better exploit the differences and complementarity between the global metric and the local class metrics on the Riemannian manifold, improving classification performance.
(4) LSMML optimizes a convex and geodesically convex function and obtains a globally optimal closed-form solution in each iteration, so it enjoys low computational complexity and good classification performance.
(5) Further, we extend LSMML to semi-supervised learning and propose a low-rank semi-supervised multi-metric classification approach (LSeMML). In LSeMML, a manifold regularization term is introduced to make full use of the geometric structure information embedded in labeled and unlabeled data. This is a successful attempt to incorporate the idea of multi-metric learning into semi-supervised learning.
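The LogDet divergence named in merit (3) compares two positive definite matrices A and B as D_ld(A, B) = tr(AB⁻¹) − log det(AB⁻¹) − d. A minimal sketch (toy matrices, not the paper's objective):

```python
import numpy as np

def logdet_div(A, B):
    # LogDet divergence D_ld(A, B) = tr(A B^-1) - log det(A B^-1) - d,
    # defined for symmetric positive definite d x d matrices A and B
    d = A.shape[0]
    C = A @ np.linalg.inv(B)
    return float(np.trace(C) - np.log(np.linalg.det(C)) - d)

A = np.array([[2.0, 0.5], [0.5, 1.0]])
B = np.diag([1.0, 3.0])
# The divergence of a matrix from itself is zero
print(logdet_div(A, A))
# Unlike the Frobenius distance, it is asymmetric in its arguments
print(logdet_div(A, B), logdet_div(B, A))
```

Because it is defined on the cone of positive definite matrices, this divergence respects the Riemannian geometry of metric matrices in a way the flat Euclidean distance does not.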
The rest of this paper is organized as follows. Section 2 briefly reviews supervised and semi-supervised metric learning approaches and GMML. Sections 3 and 4 describe the proposed LSMML and LSeMML. Section 5 analyzes and compares the experimental results of all approaches to demonstrate the performance of LSMML and LSeMML. Finally, Section 6 concludes the paper.
Section snippets
Related work
In this section, we briefly review three topics: the development of supervised and semi-supervised metric learning, the approach GMML [32] and the framework of semi-supervised metric learning.
Low-rank geometric mean multi-metric learning
In this section, a novel supervised multi-metric learning approach is proposed, which jointly learns one global metric and multiple local class metrics to capture the complex structure information of each class.
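To make the per-class idea concrete, one way such a model can be used at test time (a hypothetical sketch under a nearest-class-mean rule, not the paper's exact decision rule) is to score a query under each class's local Mahalanobis metric and pick the nearest class:

```python
import numpy as np

def predict_with_class_metrics(x, class_means, class_metrics):
    # class_metrics[c] is a PSD matrix M_c learned for class c;
    # classify x by the smallest Mahalanobis distance to each class mean
    best, best_dist = None, np.inf
    for c, mu in class_means.items():
        diff = x - mu
        dist = float(diff @ class_metrics[c] @ diff)
        if dist < best_dist:
            best, best_dist = c, dist
    return best

means = {0: np.array([0.0, 0.0]), 1: np.array([4.0, 4.0])}
metrics = {0: np.eye(2), 1: np.eye(2)}   # toy: identity metrics per class
print(predict_with_class_metrics(np.array([0.5, 0.2]), means, metrics))
```

Learning each M_c jointly with a shared global metric, as LSMML does, couples these per-class scores instead of fitting each class in isolation.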
Semi-supervised low-rank geometric mean multi-metric learning
Supervised metric learning algorithms take advantage of the abundant label information of data to get the distance metric we want. However, in practice, collecting labels is a time-consuming and challenging task, which gives birth to a large number of semi-supervised metric learning approaches. In this section, we extend the proposed LSMML into the semi-supervised learning framework.
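A common ingredient of such semi-supervised frameworks is a graph Laplacian built over labeled and unlabeled samples, whose quadratic form f'Lf penalizes functions that vary across nearby points. A sketch assuming a dense Gaussian-weighted graph (illustrative, not the paper's exact construction):

```python
import numpy as np

def graph_laplacian(X, sigma=1.0):
    # Similarity graph with heat-kernel weights
    # W_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)); Laplacian L = D - W
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq / (2 * sigma**2))
    np.fill_diagonal(W, 0.0)
    return np.diag(W.sum(axis=1)) - W

X = np.array([[0.0], [0.1], [5.0]])   # two close points, one far point
L = graph_laplacian(X)
# f^T L f = 0.5 * sum_ij W_ij (f_i - f_j)^2 >= 0: a labeling that is
# constant on the tight pair but differs on the far point costs little
f = np.array([1.0, 1.0, -1.0])
print(f @ L @ f)
```

Adding such a regularizer lets unlabeled samples shape the learned metrics through the geometry of the data manifold.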
Experiment
In this section, the numerical experiments are elaborated to explore the validity and scalability of the proposed LSMML and LSeMML.
Conclusion
In this work, we adopt a divide-and-conquer strategy to handle supervised and semi-supervised classification problems. First, a supervised multi-metric learning framework (ASMMLF) is proposed, which can perform global metric learning or multi-metric learning under different parameter settings. Based on ASMMLF, we design a specific model, the low-rank supervised multi-metric approach (LSMML), which trains multiple local class metrics and one global metric according to the sample distribution of each class.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (No. 11471010 and No. 11271367). Moreover, the authors thank the referees and editor for their constructive comments to improve the paper.
References (54)
- et al., Learning non-metric visual similarity for image retrieval, Image Vis. Comput. (2019)
- et al., A nearest-neighbor search model for distance metric learning, Inform. Sci. (2021)
- et al., Efficient multi-modal geometric mean metric learning, Pattern Recognit. (2018)
- et al., Multiview discriminative marginal metric learning for makeup face verification, Neurocomputing (2019)
- et al., Local discriminative distance metrics ensemble learning, Pattern Recognit. (2013)
- et al., An efficient method for clustered multi-metric learning, Inform. Sci. (2019)
- et al., Global and local metric learning via eigenvectors, Knowl.-Based Syst. (2017)
- et al., Marginal semi-supervised sub-manifold projections with informative constraints for dimensionality reduction and recognition, Neural Netw. (2012)
- et al., A selection metric for semi-supervised learning based on neighborhood construction, Inf. Process. Manage. (2021)
- et al., Semi-supervised learning framework based on statistical analysis for image set classification, Pattern Recognit. (2020)