T-MAN: a neural ensemble approach for person re-identification using spatio-temporal information

Tagore, Nirbhay Kumar; Chattopadhyay, Pratik; Wang, Lipo

doi:10.1007/s11042-020-09398-0

Published: 03 August 2020

Volume 79, pages 28393–28409, (2020)

Multimedia Tools and Applications

Abstract

Person re-identification plays a central role in tracking and monitoring crowd movement in public places, and hence it serves as an important means of providing public security at video surveillance sites. The problem has received significant attention in the past few years, and with the introduction of deep learning, several interesting approaches have been developed. In this paper, we propose an ensemble model called Temporal Motion Aware Network (T-MAN) that jointly handles the visual context and spatio-temporal information in input video sequences. Our methodology uses long-range motion context together with recurrent information to establish correspondences across multiple cameras. The proposed T-MAN approach first extracts explicit frame-level feature descriptors from a given video sequence using three different sub-networks (FPAN, MPN, and LSTM), and then aggregates these models using an ensemble technique to perform re-identification. The method has been evaluated on three publicly available data sets, namely PRID-2011, iLIDS-VID, and MARS, on which re-identification accuracies of 83.0%, 73.5%, and 83.3%, respectively, were obtained. The experimental results demonstrate the effectiveness of our approach and its superiority over state-of-the-art techniques for video-based person re-identification.
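The abstract does not state which ensemble rule combines the three sub-networks, so the following is only a minimal sketch of one common choice, weighted averaging of per-model identity probabilities. The function names, the equal default weights, and the score values are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_identify(per_model_scores, weights=None):
    """Fuse per-model identity scores by weighted averaging.

    ASSUMPTION: score-level averaging stands in for the paper's
    unspecified ensemble technique.
    Returns (index of best-matching identity, fused probabilities).
    """
    n_models = len(per_model_scores)
    if weights is None:
        # Equal weights unless the caller supplies others.
        weights = [1.0 / n_models] * n_models
    probs = [softmax(s) for s in per_model_scores]
    n_ids = len(probs[0])
    fused = [
        sum(w * p[i] for w, p in zip(weights, probs))
        for i in range(n_ids)
    ]
    best_id = max(range(n_ids), key=fused.__getitem__)
    return best_id, fused


# Hypothetical raw scores over three gallery identities, one list per
# sub-network; the values are made up for illustration only.
scores = {
    "FPAN": [2.0, 1.0, 0.1],
    "MPN": [1.0, 1.5, 0.3],
    "LSTM": [1.8, 1.2, 0.2],
}
best_id, fused = ensemble_identify(list(scores.values()))
print(best_id, [round(p, 3) for p in fused])
```

Averaging probabilities rather than raw scores keeps sub-networks with different output scales comparable; unequal weights could be passed in if one sub-network were known to be more reliable on a given data set.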

Acknowledgements

The authors would like to acknowledge NVIDIA for supporting their research with a TITAN Xp graphics processing unit.

Author information

Authors and Affiliations

Pattern Recognition Lab, Department of Computer Science and Engineering, Indian Institute of Technology (BHU), Varanasi, 221005, Uttar Pradesh, India
Nirbhay Kumar Tagore & Pratik Chattopadhyay
School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, 639798, Singapore
Lipo Wang

Corresponding author

Correspondence to Pratik Chattopadhyay.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Tagore, N.K., Chattopadhyay, P. & Wang, L. T-MAN: a neural ensemble approach for person re-identification using spatio-temporal information. Multimed Tools Appl 79, 28393–28409 (2020). https://doi.org/10.1007/s11042-020-09398-0

Received: 17 January 2020
Revised: 10 July 2020
Accepted: 21 July 2020
Published: 03 August 2020
Issue Date: October 2020
DOI: https://doi.org/10.1007/s11042-020-09398-0
