Independent metric learning with aligned multi-part features for video-based person re-identification

Multimedia Tools and Applications

Abstract

Video-based person re-identification attracts wide attention because it plays a crucial role in many video surveillance applications. The task is to match image sequences of pedestrians recorded by non-overlapping cameras. As in many visual recognition problems, variations in pose, viewpoint, and illumination, together with occlusion, make this task non-trivial. To make features more robust to these variations and to occlusion, this paper designs an aligned multi-part image model inspired by the human visual attention mechanism. The model first applies a pose estimation method to align the pedestrians, then divides the images into parts to extract multi-part appearance features. In addition, we present independent metric learning to combine the multi-part appearance features with spatial-temporal features: each feature is fed into distance metric learning separately, which yields several metric kernels. These kernels are fused with weights learned by the attention measure. This fusion scheme achieves better functional complementarity among the features. In the experiments, we analyze the effectiveness of the major components. Extensive experiments on two public benchmark datasets, iLIDS-VID and PRID-2011, demonstrate the effectiveness of the proposed method.
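
The fusion step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each metric kernel is a Mahalanobis-style matrix applied to one feature, and that the attention weights are already available as plain numbers. The function names `mahalanobis_sq` and `fused_distance`, and all of the toy values, are hypothetical.

```python
def mahalanobis_sq(x, y, M):
    """Squared distance (x - y)^T M (x - y) under a metric matrix M."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))


def fused_distance(probe_feats, gallery_feats, metrics, weights):
    """Weighted sum of per-feature distances, each under its own metric.

    Assumption: the weights come from some attention measure computed
    elsewhere; here they are only normalised so that they sum to 1.
    """
    total = float(sum(weights))
    norm_w = [w / total for w in weights]
    dists = [
        mahalanobis_sq(p, g, M)
        for p, g, M in zip(probe_feats, gallery_feats, metrics)
    ]
    return sum(w * d for w, d in zip(norm_w, dists))


# Toy data: two features (e.g. two body parts), each a 2-D vector.
probe = [[1.0, 0.0], [0.0, 2.0]]
gallery = [[0.0, 0.0], [0.0, 0.0]]
metrics = [
    [[1.0, 0.0], [0.0, 1.0]],  # identity metric for the first feature
    [[0.5, 0.0], [0.0, 0.5]],  # scaled identity for the second feature
]
weights = [3.0, 1.0]  # hypothetical attention weights (unnormalised)

result = fused_distance(probe, gallery, metrics, weights)
print(result)
```

Keeping one metric per feature, rather than a single metric over the concatenated features, lets each feature's distance be weighted on its own, which is the sense of "independent" assumed in this sketch.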

Author information

Corresponding author

Correspondence to Jingjing Wu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work is supported by the National Natural Science Foundation of China under Grants 61876056 and 61771180.

About this article

Cite this article

Wu, J., Jiang, J., Qi, M. et al. Independent metric learning with aligned multi-part features for video-based person re-identification. Multimed Tools Appl 78, 29323–29341 (2019). https://doi.org/10.1007/s11042-018-7119-6

  • DOI: https://doi.org/10.1007/s11042-018-7119-6
