
Full-scaled deep metric learning for pedestrian re-identification

Multimedia Tools and Applications

Abstract

Pedestrian re-identification (re-id) is an essential prerequisite in multi-camera video surveillance: pedestrian targets must be accurately re-identified across a network of cameras with non-overlapping fields of view before higher-level tasks (e.g., tracking, behavior analysis, and activity monitoring) can be carried out. Driven by recent developments in deep learning, the re-id problem is often tackled via either deep discriminant learning or deep generative learning. However, most contemporary deep learning-based models with extremely deep structures are hard to train because of the notorious vanishing gradient problem. In this study, a novel full-scaled deep discriminant learning model is proposed. Its novelty lies in considering three crucial concepts in the design of a deep learning model, namely depth, width, and cardinality, simultaneously. As a result, the new model need not be extremely deep and is easier to train. Moreover, based on the new model, a novel deep metric learning method is proposed to solve the re-id problem. Technically, two algorithms are derived, one based on conventional stochastic gradient descent (SGD) and the other on a more efficient alternative, proximal gradient descent (PGD). In the experiments, the proposed full-scaled deep metric learning method is comprehensively compared with dozens of popular re-id methods, drawn from both deep learning and shallow learning. Several well-known public re-id datasets are used, and rigorous statistical analyses are carried out to compare the re-id performance of all methods. The superiority of the proposed method is substantiated from a statistical point of view.
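
As a rough illustration of what a proximal gradient descent (PGD) update looks like in general, the sketch below applies it to a toy L1-regularized least-squares problem. The abstract does not state the loss or the regularizer used in the paper, so the objective, the function names, and the parameters `lam` and `step` are all assumptions chosen for illustration; this is not the authors' algorithm.

```python
def soft_threshold(v, t):
    """Proximal operator of t * ||v||_1, applied elementwise."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def grad_least_squares(A, b, w):
    """Gradient of 0.5 * ||A w - b||^2 with respect to w."""
    residual = [
        sum(a_ij * w_j for a_ij, w_j in zip(row, w)) - b_i
        for row, b_i in zip(A, b)
    ]
    return [
        sum(A[i][j] * residual[i] for i in range(len(A)))
        for j in range(len(w))
    ]


def proximal_gradient_descent(A, b, lam, step, iters):
    """Minimize 0.5 * ||A w - b||^2 + lam * ||w||_1 (an assumed toy objective).

    Each iteration takes a gradient step on the smooth term only, then
    applies the proximal operator of the non-smooth L1 term.
    """
    w = [0.0] * len(A[0])
    for _ in range(iters):
        g = grad_least_squares(A, b, w)
        moved = [w_j - step * g_j for w_j, g_j in zip(w, g)]
        w = soft_threshold(moved, step * lam)
    return w


# Toy usage with an identity design matrix (hypothetical data).
A = [[1.0, 0.0], [0.0, 1.0]]
b = [3.0, 0.5]
w = proximal_gradient_descent(A, b, lam=1.0, step=1.0, iters=50)
```

Unlike a plain (sub)gradient step on the full objective, the proximal step handles the non-smooth term in closed form, which is why small coefficients can be driven exactly to zero in this toy setting.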



Acknowledgments

This work was partially supported by grants 61862043 and 61971352 from the National Natural Science Foundation of China, grant S2020RCDT2K0033 from the Natural Science Foundation of Jiangxi Province, and grant 2018JM6015 from the Natural Science Foundation of Shaanxi Province. The source code of the new method has been made public at the following URL: https://github.com/Lmy0217/FDML.

Author information

Corresponding author

Correspondence to Wei Huang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Huang, W., Luo, M., Zhang, P. et al. Full-scaled deep metric learning for pedestrian re-identification. Multimed Tools Appl 80, 5945–5975 (2021). https://doi.org/10.1007/s11042-020-09997-x

  • DOI: https://doi.org/10.1007/s11042-020-09997-x
