Abstract
In recent years, driven by the success of deep learning, deep metric learning has become a popular end-to-end paradigm in the computer vision community. However, existing deep metric learning frameworks face a dilemma in choosing the hardness of training examples: the harder the examples fed to the network, the more discriminative the learned model is likely to be, but the more easily the network gets stuck in poor local minima in practice. To resolve this dilemma, we propose a deep metric learning method based on FAlse Positive ProbabilitY (FAPPY) that gradually incorporates different hardness levels. Unlike mainstream deep metric learning schemes, the proposed approach optimizes the similarity probability distribution among training samples rather than the similarity itself. Experimental results on the CUB-200-2011, Stanford Online Products, and VehicleID datasets show that FAPPY matches or outperforms state-of-the-art metric learning methods on fine-grained image retrieval and vehicle re-identification. Moreover, the proposed method is relatively insensitive to its hyper-parameters and requires only minor changes to conventional classification networks.
Notes
- 1.
Recall@K is the recall score averaged over all query images in the test set. For each query image, the recall score is 1 if at least one positive image appears among the K nearest returned images, and 0 otherwise.
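The metric above can be sketched in a few lines of Python. This is an illustrative implementation, not code from the paper; the function name `recall_at_k` and the toy label arrays are our own.

```python
def recall_at_k(query_labels, neighbor_labels, k):
    """Recall@K: fraction of queries whose K nearest retrieved images
    contain at least one image sharing the query's label.

    query_labels:    length-Q sequence, label of each query image
    neighbor_labels: Q sequences of retrieved-image labels, sorted by
                     increasing distance (the query itself excluded)
    """
    hits = 0
    for q_label, ranked in zip(query_labels, neighbor_labels):
        if q_label in ranked[:k]:  # recall score 1 for this query
            hits += 1
    return hits / len(query_labels)

# toy check: 2 of 3 queries have a same-label image in their top 2
queries = [0, 1, 2]
ranked = [[0, 3, 1],   # hit at rank 1
          [3, 1, 0],   # hit at rank 2
          [0, 1, 3]]   # miss within top 2
print(recall_at_k(queries, ranked, k=2))  # 0.666...
```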
Acknowledgements
This work is partially supported by the National Science Foundation of China (No. U1611461), the Shenzhen Peacock Plan (20130408-183003656), the Science and Technology Planning Project of Guangdong Province (No. 2014B090910001), and the National Natural Science Foundation of China (61602014). In addition, we would like to thank the Guangzhou Supercomputer Center for providing the Tianhe-2 system used in our experiments and for its technical support.
Appendix A
Lemma 1.
Proof.
We compute the gradients of these pairs:
Strictly speaking, the necessary and sufficient condition for \( \frac{\partial P_{ij}^{\Delta}}{\partial s_{ik}} \ne 0 \) is \( s_{ik}, s_{ij} \in \left( b_{t}, b_{t+1} \right] \). Likewise, \( \frac{\partial P_{ij}^{\Delta}}{\partial s_{jl}} \ne 0 \) if and only if \( s_{jl}, s_{ij} \in \left( b_{t}, b_{t+1} \right] \).
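The sparsity condition above can be checked numerically: the gradient with respect to a negative similarity \( s_{ik} \) is nonzero only when \( s_{ik} \) falls into the same half-open histogram bin \( (b_t, b_{t+1}] \) as the positive similarity \( s_{ij} \). The following sketch is illustrative only; the helper `same_bin` and the chosen bin edges are our own, not part of the paper.

```python
import numpy as np

def same_bin(s_a, s_b, edges):
    """True iff s_a and s_b lie in the same half-open bin (b_t, b_{t+1}].

    With side='left', np.searchsorted maps any s in (b_t, b_{t+1}]
    to index t+1, so equal indices mean the same half-open bin.
    """
    return (np.searchsorted(edges, s_a, side='left')
            == np.searchsorted(edges, s_b, side='left'))

edges = np.linspace(-1.0, 1.0, 11)  # 10 equal bins over cosine similarity
print(same_bin(0.31, 0.35, edges))  # True:  both in (0.2, 0.4]
print(same_bin(0.31, 0.45, edges))  # False: different bins, gradient is 0
```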
The Equivalency
Proof.
In the L2-normalized space, the feature vectors are defined as \( \overrightarrow{f_{x}^{L2}} = \frac{\overrightarrow{f_{x}}}{\left\| \overrightarrow{f_{x}} \right\|} \).
The Euclidean distance \( d_{xy}^{L2} \) and \( \Delta^{L2} \) are as follows:
Therefore, \( \Delta^{L2} \) is positively related to \( \Delta \), while \( d_{xy}^{L2} \) is negatively related to \( s_{xy} \), so the explanation in Fig. 4 holds.
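The monotone relation between \( d_{xy}^{L2} \) and \( s_{xy} \) follows from the identity \( (d_{xy}^{L2})^2 = 2 - 2 s_{xy} \) for unit vectors, which a quick numerical sketch confirms (the random 64-dimensional vectors here are just a toy example):

```python
import numpy as np

# For L2-normalized vectors, ||fx - fy||^2 = 2 - 2 * <fx, fy>,
# so Euclidean distance decreases monotonically as cosine
# similarity s_xy increases.
rng = np.random.default_rng(0)
fx, fy = rng.normal(size=64), rng.normal(size=64)
fx, fy = fx / np.linalg.norm(fx), fy / np.linalg.norm(fy)

s_xy = fx @ fy                  # cosine similarity of unit vectors
d_xy = np.linalg.norm(fx - fy)  # Euclidean distance
print(np.isclose(d_xy ** 2, 2 - 2 * s_xy))  # True
```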
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Zhong, JX., Li, G., Li, N. (2017). Deep Metric Learning with False Positive Probability. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science(), vol 10636. Springer, Cham. https://doi.org/10.1007/978-3-319-70090-8_66
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70089-2
Online ISBN: 978-3-319-70090-8