Multi-metric learning by a pair of twin-metric learning framework

Zhang, Min; Yang, Liming; Yuan, Chao; Ren, Qiangqiang

doi:10.1007/s10489-022-03330-9

Published: 02 April 2022

Applied Intelligence, Volume 52, pages 17490–17507 (2022)

Min Zhang¹, Liming Yang¹, Chao Yuan² & Qiangqiang Ren²

Abstract

Multi-metric learning is important for improving classification performance, since a single metric is usually insufficient for complex data. Existing multi-metric learning methods are based on triplet constraints and therefore have high computational complexity. In this work, we propose an efficient multi-metric learning framework built from a pair of two-metric learning schemes (called TMML) that jointly trains two local metrics and a global metric, where the distances between samples are automatically adjusted to maximize the classification margin. Instead of triplet constraints, the proposed TMML uses pair constraints to reduce the computational burden. Moreover, a global regularization term is introduced to improve generalization and control overfitting. TMML overcomes the limitation of a single metric: the pair of local metrics are interrelated and adapt to local characteristics, while the global metric captures the properties common to all the data. Furthermore, we develop an alternating direction iterative algorithm to optimize TMML and analyze its convergence theoretically. Numerical experiments are carried out on datasets of different scales. Under different evaluation criteria, the experiments show that TMML is superior to single-metric learning methods and achieves better performance than other state-of-the-art multi-metric learning methods in most cases.
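To give a rough sense of why pair constraints are cheaper than triplet constraints, the sketch below counts both kinds of constraints exhaustively for a labeled sample. It is only an illustration: the abstract does not say how TMML actually selects its constraints (for example, it may restrict them to nearest neighbors), and the function names `count_pair_constraints` and `count_triplet_constraints` are hypothetical.

```python
def count_pair_constraints(labels):
    """Number of unordered sample pairs (each is either similar or dissimilar)."""
    n = len(labels)
    return n * (n - 1) // 2


def count_triplet_constraints(labels):
    """Number of (anchor, same-class sample, different-class sample) triplets."""
    total = 0
    for i, yi in enumerate(labels):
        # Candidates sharing the anchor's label, excluding the anchor itself.
        same = sum(1 for j, yj in enumerate(labels) if j != i and yj == yi)
        # Candidates with a different label.
        diff = sum(1 for yj in labels if yj != yi)
        total += same * diff
    return total


# Assumed toy setting: two balanced classes with 50 samples each.
labels = [0] * 50 + [1] * 50
print(count_pair_constraints(labels))     # 4950
print(count_triplet_constraints(labels))  # 245000
```

In this exhaustive setting the number of pairs grows quadratically with the sample size while the number of triplets grows cubically, which is the kind of gap the pair-based formulation is meant to avoid.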

References

[1] Davis JV, Kulis B, Jain P, Sra S, Dhillon IS (2007) Information-theoretic metric learning. In: Proceedings of the 24th international conference on machine learning, pp 209–216
[2] Li D, Tian Y (2018) Survey and experimental study on metric learning methods. Neural Netw 105:447–462
[3] Li CH, Jing L, Li H (2014) Naive Bayes for value difference metric. Front Comput Sci 8(2):255–264
[4] Nguyen B, De Baets B (2018) An approach to supervised distance metric learning based on difference of convex functions programming. Pattern Recogn 81:562–574
[5] Zadeh PH, Hosseini R, Sra S (2016) Geometric mean metric learning. In: International conference on machine learning (ICML), pp 2464–2471
[6] Ye HJ, Zhan DC, Li CN, Jiang Y (2020) Learning multiple local metrics: global consideration helps. IEEE Trans Pattern Anal Mach Intell 42(7):1698–1712
[7] Li CH, Jing L, Li H, Wu J, Zhang P (2017) Toward value difference metric with attribute weighting. Knowl Inf Syst 50(3):795–825
[8] Noh YK, Zhang BT, Lee DD (2018) Generative local metric learning for nearest neighbor classification. IEEE Trans Pattern Anal Mach Intell 40(1):106–118
[9] Dong M, Wang Y, Yang X et al (2018) Learning local metrics and influential regions for classification. IEEE Trans Pattern Anal Mach Intell:1–8. https://doi.org/10.1109/TPAMI.2019.2914899
[10] Ye HJ, Zhan DC, Jiang Y et al (2019) What makes objects similar: a unified multi-metric learning approach. IEEE Trans Pattern Anal Mach Intell 41(5):1257–1270
[11] Nguyen B, Ferri FJ, Morell C, De Baets B (2019) An efficient method for clustered multi-metric learning. Inf Sci 471:149–163
[12] Zuo W, Wang F, Zhang D, Lin L, Huang Y, Meng D, Zhang L (2017) Distance metric learning via iterated support vector machines. IEEE Trans Image Process 99:1–1
[13] Kan S, Zhang L, He ZH, Cen Y, Chen SH, Zhou J et al (2020) Metric learning-based kernel transformer with triplets and label constraints for feature fusion. Pattern Recogn 99:107086
[14] Meyer G, Bonnabel S, Sepulchre RJ (2011) Regression on fixed-rank positive semidefinite matrices: a Riemannian approach. J Mach Learn Res 12(3):593–625
[15] Weinberger KQ, Saul LK (2009) Distance metric learning for large margin nearest neighbor classification. J Mach Learn Res 10(1):207–244
[16] Weinberger KQ, Saul LK (2008) Fast solvers and efficient implementations for distance metric learning. In: Proceedings of the twenty-fifth international conference on machine learning, pp 1160–1167
[17] Liang J, Hu Q et al (2018) Efficient multi-modal geometric mean metric learning. Pattern Recogn 75:188–198
[18] Li DW, Tian YJ (2017) Global and local metric learning via eigenvectors. Knowl-Based Syst 116:152–162
[19] Domeniconi C, Peng J, Gunopulos D (2001) An adaptive metric machine for pattern classification. In: Adv Neural Inf Process Syst 13:458–464
[20] Bohne J, Ying Y, Gentric S, Pontil M (2004) Large margin local metric learning. In: Proceedings of the European conference on computer vision, pp 679–694
[21] Ying Y, Li P (2012) Distance metric learning with eigenvalue optimization. J Mach Learn Res 13(1):1–26
[22] Zuo W, Wang F, Zhang D et al (2017) Distance metric learning via iterated support vector machines. IEEE Trans Image Process 26(10):4937–4950
[23] Wang F, Zuo W, Zhang L et al (2015) A kernel classification framework for metric learning. IEEE Trans Neural Netw Learn Syst 26(9):1950–1962
[24] Dong M, Wang Y, Yang X et al (2018) Local metrics and influential regions for classification. IEEE Trans Pattern Anal Mach Intell
[25] Parameswaran S, Weinberger KQ (2010) Large margin multi-task metric learning. In: Advances in neural information processing systems, pp 1867–1875
[26] Shalev-Shwartz S, Singer Y, Srebro N, Cotter A (2011) Pegasos: primal estimated sub-gradient solver for SVM. Math Program 127(1):3–30
[27] Tseng P (2001) Convergence of a block coordinate descent method for nondifferentiable minimization. J Optim Theory Appl 109(3):475–494
[28] Ruan Y, Xiao Y, Hao Z, Liu B (2021) A convex model for distance metric learning. IEEE Trans Neural Netw Learn Syst:1–14
[29] Blake C, Merz C (1998) UCI repository for machine learning databases. http://www.ics.uci.edu/~mlearn/MLRepository.html
[30] Chang C, Lin C (2001) LIBSVM data set. https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/
[31] Fawcett T (2006) An introduction to ROC analysis. Pattern Recogn Lett 27(8):861–874
[32] Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7(1):1–30
[33] Dunn OJ (1961) Multiple comparisons among means. J Am Stat Assoc 56(293):52–64
[34] Wu Z, Zhu H, Li G, Cui Z, Huang H, Li J, Chen E, Xu G (2017) An efficient Wikipedia semantic matching approach to text document classification. Inf Sci 393:15–28
[35] Wu Z, Li G, Liu Q, Xu G, Chen E (2018) Covering the sensitive subjects to protect personal privacy in personalized recommendation. IEEE Trans Serv Comput 11(3):493–506
[36] Bai B, Li G, Wang S, Wu Z, Yan W (2021) Time series classification based on multi-feature dictionary representation and ensemble learning. Expert Syst Appl 169:114162

Acknowledgements

This work is supported by the National Natural Science Foundation of China (11471010, 11271367). The authors also thank the referees and the editor for their constructive comments, which helped improve the paper.

Author information

Authors and Affiliations

College of Science, China Agricultural University, Beijing, China
Min Zhang & Liming Yang
College of Information and Electrical Engineering, China Agricultural University, Beijing, China
Chao Yuan & Qiangqiang Ren

Corresponding author

Correspondence to Liming Yang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix:

Conditions that guarantee the convergence of the block-coordinate descent (BCD) method [27].

Assume that the objective function to be optimized has the following form:

$$ f(x_{1},\cdots,x_{N})=f_{0}(x_{1},\cdots,x_{N})+\sum\limits_{k=1}^{N}f_{k}(x_{k}) \tag{42} $$

Suppose that $f, f_{0}, f_{1},\cdots ,f_{N}$ satisfy Assumptions B1-B3 and that $f_{0}$ satisfies either Assumption C1 or C2. Also, assume that the sequence $\{x^{r}=({x_{1}^{r}},\cdots ,{x_{N}^{r}})\}_{r=0,1,\cdots }$ generated by the BCD method using the essentially cyclic rule is defined. Then either $\{f(x^{r})\}\downarrow -\infty $, or else every cluster point $z = (z_{1},\cdots ,z_{N})$ is a coordinatewise minimum point of $f$.

(B1) $f_{0}$ is continuous on $\operatorname{dom} f_{0}$.
(B2) For each $k \in \{1,\cdots ,N\}$ and $(x_{j})_{j\neq k}$, the function $x_{k}\mapsto f(x_{1},\cdots ,x_{N})$ is quasiconvex and hemivariate.
(B3) $f_{0}, f_{1},\cdots ,f_{N}$ are lower semicontinuous.
(C1) $\operatorname{dom} f_{0}$ is open and $f_{0}$ tends to $\infty $ at every boundary point of $\operatorname{dom} f_{0}$.
(C2) $\operatorname{dom} f_{0} = Y_{1}\times \cdots \times Y_{N}$ for some $Y_{k} \subseteq \mathbb{R}^{n_{k}}$, $k=1,\cdots ,N$.

Theorem [27]. Suppose that $f, f_{0},\cdots ,f_{N}$ satisfy Assumptions B1-B3 and that $f_{0}$ satisfies either Assumption C1 or C2. Then the block-coordinate descent method using the essentially cyclic rule converges to an optimal point of $f$.
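As a concrete illustration of a cyclic block-coordinate descent iteration on an objective of the form $f_{0}+\sum_{k} f_{k}$, the sketch below minimizes a small lasso-type problem with one coordinate per block. This toy objective is swapped in for illustration and is not the TMML objective; `cyclic_bcd`, `objective`, `soft_threshold`, the random data, and the value of `lam` are all assumptions introduced here.

```python
import numpy as np


def soft_threshold(v, t):
    """Closed-form minimizer of 0.5 * (z - v)**2 + t * |z| over z."""
    return np.sign(v) * max(abs(v) - t, 0.0)


def objective(A, b, x, lam):
    """f(x) = f0(x) + sum_k f_k(x_k), with f0 smooth and f_k(x_k) = lam * |x_k|."""
    r = A @ x - b
    return 0.5 * r @ r + lam * np.abs(x).sum()


def cyclic_bcd(A, b, lam, sweeps=200):
    """Cyclic rule: visit blocks 1..N in order and minimize f exactly over each."""
    n = A.shape[1]
    x = np.zeros(n)
    col_sq = (A ** 2).sum(axis=0)  # squared norm of each column of A
    history = [objective(A, b, x, lam)]
    for _ in range(sweeps):
        for k in range(n):
            # Residual with the contribution of coordinate k removed.
            r_k = b - A @ x + A[:, k] * x[k]
            # Exact minimization over x_k with all other coordinates fixed.
            x[k] = soft_threshold(A[:, k] @ r_k, lam) / col_sq[k]
        history.append(objective(A, b, x, lam))
    return x, history


# Assumed toy data, for illustration only.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
x, history = cyclic_bcd(A, b, lam=0.5)

# The objective should not increase from one sweep to the next.
print(all(h1 <= h0 + 1e-9 for h0, h1 in zip(history, history[1:])))
```

Because each inner step minimizes the objective exactly over one block, the recorded objective values are non-increasing, and the final iterate can be checked numerically by perturbing one coordinate at a time and confirming that the objective does not decrease.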

About this article

Cite this article

Zhang, M., Yang, L., Yuan, C. et al. Multi-metric learning by a pair of twin-metric learning framework. Appl Intell 52, 17490–17507 (2022). https://doi.org/10.1007/s10489-022-03330-9

Accepted: 01 February 2022
Published: 02 April 2022
Issue Date: December 2022
DOI: https://doi.org/10.1007/s10489-022-03330-9
