Full-scaled deep metric learning for pedestrian re-identification

Huang, Wei; Luo, Mingyuan; Zhang, Peng; Zha, Yufei

doi:10.1007/s11042-020-09997-x

Published: 10 October 2020

Volume 80, pages 5945–5975, (2021)

Multimedia Tools and Applications

Wei Huang¹,² (ORCID: orcid.org/0000-0002-0541-8612),
Mingyuan Luo³,
Peng Zhang⁴ &
Yufei Zha⁴

Abstract

Pedestrian re-identification (re-id) is an essential prerequisite in multi-camera video surveillance: pedestrian targets must be accurately re-identified across a network of cameras with non-overlapping fields of view before higher-level tasks (e.g., tracking, behavior analysis, activity monitoring) can be carried out. Driven by recent developments in deep learning, the re-id problem is often tackled via either deep discriminant learning or deep generative learning. However, most contemporary deep learning models with extremely deep structures are difficult to train because of the notorious vanishing gradient problem. In this study, a novel full-scaled deep discriminant learning model is proposed. Its novelty lies in taking three crucial design concepts of a deep learning model, namely depth, width, and cardinality, into consideration simultaneously. As a result, the new model need not be extremely deep and is easier to train. Moreover, based on the new model, a novel deep metric learning method is proposed to solve the re-id problem. Technically, two algorithms are derived, one based on conventional SGD (stochastic gradient descent) and the other on PGD (proximal gradient descent), a more efficient alternative. In the experiments, the newly introduced full-scaled deep metric learning method is comprehensively compared with dozens of popular re-id methods, from both deep learning and shallow learning perspectives. Several well-known public re-id datasets are used, and rigorous statistical analyses are carried out to compare the re-id performance of all methods. The superiority of the novel full-scaled deep metric learning method is substantiated from a statistical point of view.
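
The abstract names PGD but does not state the objective it is applied to. As a rough illustration only, the sketch below runs proximal gradient descent on a toy L1-regularized least-squares problem. The objective, the function names, the step size, and the input values are all assumptions made for this example and are not taken from the paper.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |x|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def proximal_gradient_descent(target, lam, step=0.5, iters=100):
    """Minimize 0.5 * ||w - target||^2 + lam * ||w||_1 (toy objective, assumed)."""
    w = [0.0] * len(target)
    for _ in range(iters):
        # Gradient of the smooth part 0.5 * ||w - target||^2 is (w - target).
        grad = [wi - ti for wi, ti in zip(w, target)]
        # Gradient step on the smooth part, then the prox of the L1 part.
        w = [soft_threshold(wi - step * gi, step * lam) for wi, gi in zip(w, grad)]
    return w


# Hypothetical inputs chosen only to show the shrinkage behavior.
weights = proximal_gradient_descent([3.0, -0.2, -2.0], lam=0.5)
```

In this toy setting the small middle entry is set exactly to zero by the proximal step, which a plain gradient or subgradient update does not generally achieve.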

Acknowledgments

This work was partially supported by grants 61862043 and 61971352 from the National Natural Science Foundation of China, grant S2020RCDT2K0033 from the Natural Science Foundation of Jiangxi Province, and grant 2018JM6015 from the Natural Science Foundation of Shaanxi Province. The source code of the new method is publicly available at https://github.com/Lmy0217/FDML.

Author information

Authors and Affiliations

Department of Computer Science, School of Information Engineering, Nanchang University, Nanchang, 330022, China
Wei Huang
Informatization Office, Nanchang University, Nanchang, 330022, China
Wei Huang
Guangdong Provincial Key Laboratory of Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, 518061, China
Mingyuan Luo
National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science, Northwestern Polytechnical University, Xi’an, 710072, China
Peng Zhang & Yufei Zha

Corresponding author

Correspondence to Wei Huang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Huang, W., Luo, M., Zhang, P. et al. Full-scaled deep metric learning for pedestrian re-identification. Multimed Tools Appl 80, 5945–5975 (2021). https://doi.org/10.1007/s11042-020-09997-x

Received: 27 March 2020
Revised: 15 September 2020
Accepted: 29 September 2020
Published: 10 October 2020
Issue Date: February 2021
DOI: https://doi.org/10.1007/s11042-020-09997-x
