T-MAN: a neural ensemble approach for person re-identification using spatio-temporal information

Multimedia Tools and Applications

Abstract

Person re-identification plays a central role in tracking and monitoring crowd movement in public places, and hence serves as an important means of providing public security at video surveillance sites. The problem has received significant attention in the past few years, and with the introduction of deep learning, several interesting approaches have been developed. In this paper, we propose an ensemble model called Temporal Motion Aware Network (T-MAN) that jointly handles the visual context and spatio-temporal information in the input video sequences. Our methodology uses long-range motion context together with recurrent information to establish correspondences across multiple cameras. The proposed T-MAN approach first extracts explicit frame-level feature descriptors from a given video sequence using three different sub-networks (FPAN, MPN, and LSTM), and then aggregates these models using an ensemble technique to perform re-identification. The method has been evaluated on three publicly available data sets, namely PRID-2011, iLIDS-VID, and MARS, on which it obtains re-identification accuracies of 83.0%, 73.5%, and 83.3%, respectively. Experimental results demonstrate the effectiveness of our approach and its superiority over state-of-the-art techniques for video-based person re-identification.
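
The abstract does not state which ensemble technique combines the three sub-networks, so the following is only a minimal sketch of one common option, score-level fusion by averaging. It assumes each sub-network has already produced a similarity score between one probe sequence and every gallery identity. The function names, the min-max normalization step, and all numeric scores are hypothetical illustrations, not the authors' implementation or results.

```python
from typing import Dict, List


def min_max_normalize(scores: List[float]) -> List[float]:
    """Rescale one model's scores to [0, 1] so that models are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # A constant score vector carries no ranking information.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def ensemble_scores(per_model: Dict[str, List[float]]) -> List[float]:
    """Average the normalized scores across models (assumed fusion rule)."""
    normalized = [min_max_normalize(s) for s in per_model.values()]
    n_gallery = len(normalized[0])
    if any(len(s) != n_gallery for s in normalized):
        raise ValueError("every model must score the same gallery")
    return [sum(column) / len(normalized) for column in zip(*normalized)]


def rank_gallery(fused: List[float]) -> List[int]:
    """Return gallery indices ordered from best to worst match."""
    return sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)


# Hypothetical similarity scores of one probe against three gallery identities.
per_model = {
    "FPAN": [0.20, 0.90, 0.40],
    "MPN": [0.10, 0.60, 0.70],
    "LSTM": [0.30, 0.80, 0.50],
}

fused = ensemble_scores(per_model)
ranking = rank_gallery(fused)
print(ranking)
```

In this toy input the MPN scores alone would rank gallery identity 2 first, while the fused scores rank identity 1 first, which illustrates how an ensemble can outvote a single disagreeing sub-network.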

Acknowledgements

The authors would like to acknowledge NVIDIA for supporting their research with a TITAN Xp Graphics processing unit.

Author information

Corresponding author

Correspondence to Pratik Chattopadhyay.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Tagore, N.K., Chattopadhyay, P. & Wang, L. T-MAN: a neural ensemble approach for person re-identification using spatio-temporal information. Multimed Tools Appl 79, 28393–28409 (2020). https://doi.org/10.1007/s11042-020-09398-0

  • DOI: https://doi.org/10.1007/s11042-020-09398-0