Independent metric learning with aligned multi-part features for video-based person re-identification

Wu, Jingjing; Jiang, Jianguo; Qi, Meibin; Liu, Hao

doi:10.1007/s11042-018-7119-6

Published: 03 January 2019

Volume 78, pages 29323–29341, (2019)

Multimedia Tools and Applications

Jingjing Wu ORCID: orcid.org/0000-0002-3818-4277¹,
Jianguo Jiang¹,
Meibin Qi¹ &
…
Hao Liu¹


Abstract

Video-based person re-identification attracts wide attention because it plays a crucial role in many video surveillance applications. The task is to match image sequences of a pedestrian recorded by non-overlapping cameras. As in many visual recognition problems, variations in pose, viewpoint, and illumination, together with occlusion, make this task non-trivial. To make features more robust to these variations and to occlusion, this paper designs an aligned multi-part image model inspired by the human visual attention mechanism. The model applies a pose estimation method to align the pedestrians and then divides the images into parts to extract multi-part appearance features. In addition, we present independent metric learning to combine the multi-part appearance features with spatial-temporal features: each feature is fed into distance metric learning separately, which yields several metric kernels. These kernels are fused with weights learned by the attention measure. This novel feature fusion scheme achieves better functional complementarity among the features. In experiments, we analyze the effectiveness of the major components. Extensive experiments on two public benchmark datasets, iLIDS-VID and PRID-2011, demonstrate the effectiveness of the proposed method.
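The fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the metric learner, the feature extractors, or how the attention measure produces the weights. The sketch assumes each feature group already has a learned Mahalanobis-style metric matrix, takes the fusion weights as given inputs, and fuses by a normalized weighted sum. The names mahalanobis_sq and fused_distance, and the toy feature values, are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def mahalanobis_sq(x: Vector, y: Vector, metric: Matrix) -> float:
    """Squared distance (x - y)^T M (x - y) under one metric matrix M."""
    diff = [a - b for a, b in zip(x, y)]
    return sum(
        diff[i] * metric[i][j] * diff[j]
        for i in range(len(diff))
        for j in range(len(diff))
    )


def fused_distance(
    probe: List[Vector],
    gallery: List[Vector],
    metrics: List[Matrix],
    weights: Sequence[float],
) -> float:
    """Weighted sum of per-feature distances, one independent metric per feature group."""
    if not (len(probe) == len(gallery) == len(metrics) == len(weights)):
        raise ValueError("need one feature, metric, and weight per feature group")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Assumption: weights are normalized to sum to one before fusion.
    return sum(
        (w / total) * mahalanobis_sq(p, g, m)
        for p, g, m, w in zip(probe, gallery, metrics, weights)
    )


# Toy example with two hypothetical 2-D feature groups (e.g. one body part each).
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for a learned metric
scaled = [[2.0, 0.0], [0.0, 2.0]]    # stand-in for a second learned metric
probe = [[1.0, 0.0], [0.0, 0.0]]
gallery = [[0.0, 0.0], [0.0, 1.0]]
d = fused_distance(probe, gallery, [identity, scaled], weights=[3.0, 1.0])
print(d)
```

Keeping one metric per feature group, rather than concatenating all features under a single metric, lets each group be weighted separately at fusion time, which is the property the abstract attributes to independent metric learning.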

Author information

Authors and Affiliations

School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, Anhui, 230009, China
Jingjing Wu, Jianguo Jiang, Meibin Qi & Hao Liu

Corresponding author

Correspondence to Jingjing Wu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work is supported by the National Natural Science Foundation of China under Grants 61876056 and 61771180.

About this article

Cite this article

Wu, J., Jiang, J., Qi, M. et al. Independent metric learning with aligned multi-part features for video-based person re-identification. Multimed Tools Appl 78, 29323–29341 (2019). https://doi.org/10.1007/s11042-018-7119-6

Received: 26 December 2017
Revised: 08 November 2018
Accepted: 21 December 2018
Published: 03 January 2019
Issue Date: October 2019
DOI: https://doi.org/10.1007/s11042-018-7119-6
