research-article

Metric Learning via Penalized Optimization

Authors:

Sai Wu

KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining

Pages 656 - 664

https://doi.org/10.1145/3447548.3467369

Published: 14 August 2021

Abstract

Metric learning aims to project the original data into a new space in which data points can be classified more accurately using kNN or similar classification algorithms. To avoid trivial results, such as projecting all the data indistinguishably onto a line, many existing approaches formulate metric learning as a constrained optimization problem, for example finding a metric that minimizes the distance between data points from the same class subject to a constraint that ensures a certain separation between data points from different classes, and then approximate the optimal solution iteratively. To improve classification accuracy as much as possible, we seek a metric that minimizes the intra-class distance and maximizes the inter-class distance simultaneously. To this end, we formulate metric learning as a penalized optimization problem and provide design guidelines, paradigms with a general formula, and two representative instantiations of the penalty term. In addition, we provide an analytical solution to the penalized optimization problem, which avoids costly computation and, more importantly, removes any need to worry about convergence rates or approximation ratios. Extensive experiments on real-world data sets verify the effectiveness and efficiency of our approach.
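
The idea of trading off intra-class compactness against inter-class separation with a penalty, and solving the result in closed form, can be illustrated with a small sketch. This is not the paper's formulation: the abstract does not state the objective, so the code below assumes a simple trace objective tr(L^T (S_w - lam * S_b) L) with orthonormal columns in L, whose minimizer is given by the eigenvectors of S_w - lam * S_b with the smallest eigenvalues. The names `scatter_matrices`, `fit_projection`, `lam`, and `k`, as well as the toy data, are hypothetical.

```python
import numpy as np

def scatter_matrices(X, y):
    """Return within-class (S_w) and between-class (S_b) scatter of X (n x d)."""
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    S_w = np.zeros((d, d))
    S_b = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        S_w += centered.T @ centered              # spread inside class c
        diff = (mean_c - mean_all)[:, None]
        S_b += len(Xc) * (diff @ diff.T)          # spread of class means
    return S_w, S_b

def fit_projection(X, y, lam=1.0, k=1):
    """Assumed objective: minimize tr(L^T (S_w - lam * S_b) L) s.t. L^T L = I.

    The closed-form minimizer is the k eigenvectors of the symmetric matrix
    S_w - lam * S_b with the smallest eigenvalues, so no iteration is needed.
    """
    S_w, S_b = scatter_matrices(X, y)
    M = S_w - lam * S_b
    _, eigvecs = np.linalg.eigh(M)                # eigenvalues in ascending order
    return eigvecs[:, :k]                         # d x k projection matrix

# Toy data (an assumption for illustration): two classes separated along
# axis 0, with more within-class spread along axis 0 than along axis 1.
rng = np.random.default_rng(0)
n = 100
X0 = rng.normal(loc=[-5.0, 0.0], scale=[2.0, 1.0], size=(n, 2))
X1 = rng.normal(loc=[5.0, 0.0], scale=[2.0, 1.0], size=(n, 2))
X = np.vstack([X0, X1])
y = np.array([0] * n + [1] * n)

L_plain = fit_projection(X, y, lam=0.0)   # intra-class term only
L_pen = fit_projection(X, y, lam=1.0)     # with the inter-class penalty
print("without penalty:", L_plain.ravel())
print("with penalty:   ", L_pen.ravel())
```

On this toy data, minimizing the intra-class term alone selects the direction along which the two classes overlap, which is the kind of trivial result the abstract warns about, while adding the inter-class penalty selects the direction that separates them. A kNN classifier would then be run on the projected data `X @ L_pen`.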

Supplementary Material

WEBM File (slrec_20210619063028_sp.webm)

The presentation video for our work "metric learning via penalized optimization".



Cited By

Deng H, Meng X, Deng F, Feng L. (2023). UNIT: A unified metric learning framework based on maximum entropy regularization. Applied Intelligence 53:20 (24509-24529). DOI: 10.1007/s10489-023-04831-x. Online publication date: 26-Jul-2023.
https://dl.acm.org/doi/10.1007/s10489-023-04831-x

Index Terms

Metric Learning via Penalized Optimization
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
    2. Machine learning approaches
      1. Classification and regression trees
  2. Modeling and simulation
    1. Model development and analysis
      1. Model verification and validation
2. Theory of computation
  1. Design and analysis of algorithms
    1. Mathematical optimization
      1. Continuous optimization
        Convex optimization


Published In

KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining

August 2021

4259 pages

ISBN:9781450383325

DOI:10.1145/3447548

General Chairs: Feida Zhu (Singapore Management University), Beng Chin Ooi (National University of Singapore), Chunyan Miao (Nanyang Technological University)

Program Chairs: Haixun Wang, Iryna Skrypnyk, Wynne Hsu, Sanjay Chawla

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

KDD '21

KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

August 14 - 18, 2021

Virtual Event, Singapore

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
