Spherical Feature Transform for Deep Metric Learning

Zhu, Yuke; Bai, Yan; Wei, Yichen

doi:10.1007/978-3-030-58529-7_25

Yuke Zhu¹²,
Yan Bai¹³ &
Yichen Wei¹²

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 12364))

Included in the following conference series:

European Conference on Computer Vision

3424 Accesses

Abstract

Data augmentation in feature space is effective to increase data diversity. Previous methods assume that different classes have the same covariance in their feature distributions. Thus, feature transform between different classes is performed via translation. However, this approach is no longer valid for recent deep metric learning scenarios, where feature normalization is widely adopted and all features lie on a hypersphere.

This work proposes a novel spherical feature transform approach. It relaxes the assumption of identical covariance between classes to an assumption of similar covariances of different classes on a hypersphere. Consequently, the feature transform is performed by a rotation that respects the spherical data distributions. We provide a simple and effective training method, and in depth analysis on the relation between the two different transforms. Comprehensive experiments on various deep metric learning benchmarks and different baselines verify that our method achieves consistent performance improvement and state-of-the-art results.

Y. Zhu and Y. Bai—Equal contribution.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 84.99; Price excludes VAT (USA)

Softcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Deep metric learning with in-batch feature vector constraints and unsupervised label integration

Article Open access 15 November 2024

Deep Variational Metric Learning

DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning

References

Brock, A., Donahue, J., Simonyan, K.: Large scale GAN training for high fidelity natural image synthesis. arXiv preprint arXiv:1809.11096 (2018)
Cao, Q., Shen, L., Xie, W., Parkhi, O.M., Zisserman, A.: VGGFace2: a dataset for recognising faces across pose and age. In: 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), pp. 67–74. IEEE (2018)
Google Scholar
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: CVPR 2009 (2009)
Google Scholar
Deng, J., Guo, J., Xue, N., Zafeiriou, S.: ArcFace: additive angular margin loss for deep face recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4690–4699 (2019)
Google Scholar
Dixit, M., Kwitt, R., Niethammer, M., Vasconcelos, N.: AGA: attribute-guided augmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7455–7463 (2017)
Google Scholar
Duan, Y., Zheng, W., Lin, X., Lu, J., Zhou, J.: Deep adversarial metric learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2780–2789 (2018)
Google Scholar
Goodfellow, I., et al.: Generative adversarial nets. In: Advances in Neural Information Processing Systems, pp. 2672–2680 (2014)
Google Scholar
Guo, Y., Zhang, L., Hu, Y., He, X., Gao, J.: MS-Celeb-1M: a dataset and benchmark for large-scale face recognition. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9907, pp. 87–102. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46487-9_6
Chapter Google Scholar
Hamsici, O.C., Martinez, A.M.: Spherical-homoscedastic distributions: the equivalency of spherical and normal distributions in classification. J. Mach. Learn. Res. 8, 1583–1623 (2007)
Google Scholar
Huang, G.B., Mattar, M., Berg, T., Learned-Miller, E.: Labeled faces in the wild: a database for studying face recognition in unconstrained environments (2008)
Google Scholar
Krause, J., Stark, M., Deng, J., Fei-Fei, L.: 3D object representations for fine-grained categorization. In: Proceedings of the IEEE International Conference on Computer Vision Workshops, pp. 554–561 (2013)
Google Scholar
Lin, X., Duan, Y., Dong, Q., Lu, J., Zhou, J.: Deep variational metric learning. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11219, pp. 714–729. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01267-0_42
Chapter Google Scholar
Liu, B., Wang, X., Dixit, M., Kwitt, R., Vasconcelos, N.: Feature space transfer for data augmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9090–9098 (2018)
Google Scholar
Liu, W., Wen, Y., Yu, Z., Li, M., Raj, B., Song, L.: SphereFace: deep hypersphere embedding for face recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 212–220 (2017)
Google Scholar
Maze, B., et al.: IARPA Janus benchmark-C: face dataset and protocol. In: 2018 International Conference on Biometrics (ICB), pp. 158–165. IEEE (2018)
Google Scholar
Oh Song, H., Xiang, Y., Jegelka, S., Savarese, S.: Deep metric learning via lifted structured feature embedding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4004–4012 (2016)
Google Scholar
Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434 (2015)
Ranjan, R., Castillo, C.D., Chellappa, R.: L2-constrained softmax loss for discriminative face verification. arXiv preprint arXiv:1703.09507 (2017)
Schroff, F., Kalenichenko, D., Philbin, J.: FaceNet: a unified embedding for face recognition and clustering. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 815–823 (2015)
Google Scholar
Sohn, K.: Improved deep metric learning with multi-class n-pair loss objective. In: Advances in Neural Information Processing Systems, pp. 1857–1865 (2016)
Google Scholar
Sohn, K., Liu, S., Zhong, G., Yu, X., Yang, M.H., Chandraker, M.: Unsupervised domain adaptation for face recognition in unlabeled videos. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3210–3218 (2017)
Google Scholar
Sun, Y., Wang, X., Tang, X.: Deeply learned face representations are sparse, selective, and robust. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2892–2900 (2015)
Google Scholar
Szegedy, C., et al.: Going deeper with convolutions. In: Computer Vision and Pattern Recognition (CVPR) (2015). http://arxiv.org/abs/1409.4842
Taigman, Y., Yang, M., Ranzato, M., Wolf, L.: DeepFace: closing the gap to human-level performance in face verification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1701–1708 (2014)
Google Scholar
Wah, C., Branson, S., Welinder, P., Perona, P., Belongie, S.: The Caltech-UCSD birds-200-2011 dataset (2011)
Google Scholar
Wang, F., Xiang, X., Cheng, J., Yuille, A.L.: NormFace: L2 hypersphere embedding for face verification. In: Proceedings of the 25th ACM International Conference on Multimedia, pp. 1041–1049. ACM (2017)
Google Scholar
Wang, H., et al.: CosFace: large margin cosine loss for deep face recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5265–5274 (2018)
Google Scholar
Wang, X., Hua, Y., Kodirov, E., Hu, G., Garnier, R., Robertson, N.M.: Ranked list loss for deep metric learning. arXiv preprint arXiv:1903.03238 (2019)
Wang, X., Han, X., Huang, W., Dong, D., Scott, M.R.: Multi-similarity loss with general pair weighting for deep metric learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5022–5030 (2019)
Google Scholar
Wolf, L., Hassner, T., Maoz, I.: Face recognition in unconstrained videos with matched background similarity. IEEE (2011)
Google Scholar
Yin, X., Yu, X., Sohn, K., Liu, X., Chandraker, M.: Feature transfer learning for face recognition with under-represented data. In: In Proceeding of IEEE Computer Vision and Pattern Recognition, Long Beach, CA, June 2019
Google Scholar
Zhang, K., Zhang, Z., Li, Z., Qiao, Y.: Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Sig. Process. Lett. 23(10), 1499–1503 (2016)
Article Google Scholar
Zhao, Y., Jin, Z., Qi, G., Lu, H., Hua, X.: An adversarial approach to hard triplet generation. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11213, pp. 508–524. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01240-3_31
Chapter Google Scholar
Zheng, W., Chen, Z., Lu, J., Zhou, J.: Hardness-aware deep metric learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 72–81 (2019)
Google Scholar

Download references

Acknowledgment

This work was supported in part by the National Key Research and Development Program of China under Grant 2017YFA0700800.

Author information

Authors and Affiliations

MEGVII Inc., Beijing, China
Yuke Zhu & Yichen Wei
Tongji Unversity, Shanghai, China
Yan Bai

Authors

Yuke Zhu
View author publications
You can also search for this author in PubMed Google Scholar
Yan Bai
View author publications
You can also search for this author in PubMed Google Scholar
Yichen Wei
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Yichen Wei .

Editor information

Editors and Affiliations

University of Oxford, Oxford, UK
Andrea Vedaldi
Graz University of Technology, Graz, Austria
Horst Bischof
University of Freiburg, Freiburg im Breisgau, Germany
Thomas Brox
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 42 KB)

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Zhu, Y., Bai, Y., Wei, Y. (2020). Spherical Feature Transform for Deep Metric Learning. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science(), vol 12364. Springer, Cham. https://doi.org/10.1007/978-3-030-58529-7_25

Download citation

DOI: https://doi.org/10.1007/978-3-030-58529-7_25
Published: 13 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58528-0
Online ISBN: 978-3-030-58529-7
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Spherical Feature Transform for Deep Metric Learning

Abstract

Access this chapter

Subscribe and save

Buy Now

Similar content being viewed by others

Deep metric learning with in-batch feature vector constraints and unsupervised label integration

Deep Variational Metric Learning

DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning

References

Acknowledgment

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

1 Electronic supplementary material

Supplementary material 1 (pdf 42 KB)

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Subscribe and save

Buy Now

Navigation

Spherical Feature Transform for Deep Metric Learning

Abstract

Access this chapter

Subscribe and save

Buy Now

Similar content being viewed by others

Deep metric learning with in-batch feature vector constraints and unsupervised label integration

Deep Variational Metric Learning

DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning

References

Acknowledgment

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

1 Electronic supplementary material

Supplementary material 1 (pdf 42 KB)

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation