Abstract
Pedestrian re-identification (re-id) is an essential prerequisite in multi-camera video surveillance: pedestrian targets must be accurately re-identified across a network of cameras with non-overlapping fields of view before higher-level tasks (e.g., tracking, behavior analysis, and activity monitoring) can be carried out. Driven by recent developments in deep learning, the re-id problem is usually tackled with either deep discriminant learning or deep generative learning techniques. However, most contemporary deep learning models with very deep structures are difficult to train because of the notorious vanishing gradient problem. In this study, a novel full-scaled deep discriminant learning model is proposed. Its novelty lies in considering three crucial concepts in deep model design, namely depth, width, and cardinality, simultaneously. As a result, the new model need not be extremely deep and is easier to train. Moreover, based on the new model, a novel deep metric learning method is proposed to solve the re-id problem. Technically, two algorithms are derived, one based on conventional stochastic gradient descent (SGD) and the other on the more efficient proximal gradient descent (PGD). In the experiments, the proposed full-scaled deep metric learning method is comprehensively compared with dozens of popular re-id methods from both deep learning and shallow learning perspectives. Several well-known public re-id datasets are used, and rigorous statistical analyses are carried out to compare the re-id performance of all methods. The results substantiate the superiority of the proposed method from a statistical point of view.
Acknowledgments
This work was partially supported by grants 61862043 and 61971352 from the National Natural Science Foundation of China, grant S2020RCDT2K0033 from the Natural Science Foundation of Jiangxi Province, and grant 2018JM6015 from the Natural Science Foundation of Shaanxi Province. The source code of the new method is publicly available at https://github.com/Lmy0217/FDML.
Cite this article
Huang, W., Luo, M., Zhang, P. et al. Full-scaled deep metric learning for pedestrian re-identification. Multimed Tools Appl 80, 5945–5975 (2021). https://doi.org/10.1007/s11042-020-09997-x
DOI: https://doi.org/10.1007/s11042-020-09997-x