Effective multi-shot person re-identification through representative frames selection and temporal feature pooling

Nguyen, Thuy-Binh; Le, Thi-Lan; Devillaine, Louis; Pham, Thi Thanh Thuy; Ngoc, Nam Pham

doi:10.1007/s11042-019-08183-y

Published: 12 October 2019

Volume 78, pages 33939–33967, (2019)

Multimedia Tools and Applications

Thuy-Binh Nguyen¹,²,³, Thi-Lan Le¹ (ORCID: orcid.org/0000-0001-9541-3905), Louis Devillaine⁴, Thi Thanh Thuy Pham⁵ & Nam Pham Ngoc²

Abstract

Multi-shot person re-identification (ReID) is a popular case of person ReID in which a set of images is processed for each person. However, using the entire image set, as most existing approaches do, is not always effective because it is costly in time and memory. The main contribution of this work is a set of strategies for (1) choosing representative frames for each individual instead of the entire set of frames, and (2) temporal feature pooling in multi-shot person ReID. These strategies are integrated into a person ReID framework that uses GoG (Gaussian of Gaussian) and XQDA (Cross-view Quadratic Discriminant Analysis metric learning) for person representation and matching, respectively. The effectiveness of the proposed framework is investigated and analyzed in depth on two benchmark datasets (PRID 2011 and iLIDS-VID) in terms of re-identification accuracy, computational time, and storage requirements. The experimental results yield several recommendations on the use of these schemes, based on the characteristics of the working dataset and the requirements of the application. Furthermore, the study also offers a desktop-based application for person search and ReID. The implementation of the proposed framework will be made publicly available.
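
The abstract names the two strategies but does not describe them in detail, so the following is only a minimal sketch under stated assumptions. Evenly spaced sampling stands in for representative frame selection, and element-wise average or max stands in for temporal feature pooling; the paper's actual selection and pooling rules may differ. The function names and the random feature matrix are hypothetical. In the described framework the per-frame features would be GoG descriptors and matching would use XQDA, neither of which is reproduced here.

```python
import numpy as np


def select_representative_frames(num_frames, num_selected):
    """Return indices of evenly spaced frames.

    Assumption: uniform temporal sampling is used here only as a
    placeholder for a representative-frame selection rule.
    """
    if num_frames <= 0 or num_selected <= 0:
        raise ValueError("num_frames and num_selected must be positive")
    num_selected = min(num_selected, num_frames)
    # Spread the picks over the whole sequence, first to last frame.
    positions = np.linspace(0, num_frames - 1, num=num_selected)
    return np.unique(np.round(positions).astype(int))


def temporal_pool(frame_features, mode="average"):
    """Collapse a (frames x dims) feature matrix into one vector.

    Assumption: "average" and "max" are illustrative pooling choices.
    """
    frame_features = np.asarray(frame_features, dtype=float)
    if frame_features.ndim != 2:
        raise ValueError("expected a 2-D array of shape (frames, dims)")
    if mode == "average":
        return frame_features.mean(axis=0)
    if mode == "max":
        return frame_features.max(axis=0)
    raise ValueError(f"unknown pooling mode: {mode}")


# Hypothetical usage: 20 frames, 8-dimensional stand-in descriptors.
rng = np.random.default_rng(0)
features = rng.normal(size=(20, 8))
keep = select_representative_frames(len(features), 4)
signature = temporal_pool(features[keep], mode="average")
```

Pooling only the selected frames yields one fixed-length vector per person, which is the kind of reduction in computation and storage that the abstract reports investigating.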

Acknowledgments

This research is funded by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.01-2017.315.

Author information

Authors and Affiliations

Computer Vision Department, MICA International Research Institute, Hanoi University of Science and Technology, Hanoi, Vietnam
Thuy-Binh Nguyen & Thi-Lan Le
School of Electronics and Telecommunications, Hanoi University of Science and Technology, Hanoi, Vietnam
Thuy-Binh Nguyen & Nam Pham Ngoc
Faculty of Electrical and Electronics Engineering, University of Transport and Communications, Hanoi, Vietnam
Thuy-Binh Nguyen
School of Engineering in Physics, Applied Physics, Electronics & Materials Science, Grenoble Institute of Technology, Grenoble, France
Louis Devillaine
Faculty of Security and Information Technology, Academy of People Security, Hanoi, Vietnam
Thi Thanh Thuy Pham

Corresponding author

Correspondence to Thi-Lan Le.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Nguyen, TB., Le, TL., Devillaine, L. et al. Effective multi-shot person re-identification through representative frames selection and temporal feature pooling. Multimed Tools Appl 78, 33939–33967 (2019). https://doi.org/10.1007/s11042-019-08183-y

Received: 04 December 2018
Revised: 04 July 2019
Accepted: 02 September 2019
Published: 12 October 2019
Issue Date: December 2019
DOI: https://doi.org/10.1007/s11042-019-08183-y
