Abstract
Multi-shot person re-identification (ReID) is a common setting of person ReID in which a set of images is processed for each person. However, using the entire image set, as most existing approaches do, is not always effective because it is costly in time and memory. The main contribution of this work is a pair of strategies for (1) choosing representative frames for each individual instead of the entire set of frames, and (2) temporal feature pooling in multi-shot person ReID. These strategies are integrated into a person ReID framework that uses the Gaussian of Gaussian (GoG) descriptor for person representation and Cross-view Quadratic Discriminant Analysis (XQDA) metric learning for matching. The effectiveness of the proposed framework is investigated and analyzed on two benchmark datasets (PRID 2011 and iLIDS-VID) in terms of re-identification accuracy, computational time, and storage requirements. The experimental results yield several recommendations on the use of these schemes, based on the characteristics of the working dataset and the requirements of the application. Furthermore, the study also offers a desktop-based application for person search and ReID. The implementation of the proposed framework will be made publicly available.
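To make the two strategies concrete, the following is a minimal sketch in Python and not the implementation used in this work. The abstract does not specify the frame selection rule or the pooling operators, so evenly spaced sampling and average/max pooling are assumptions chosen only for illustration. The function names `select_representative_frames` and `temporal_pool` are hypothetical, and the plain lists of floats stand in for per-frame descriptors such as GoG.

```python
from typing import List, Sequence

Vector = List[float]


def select_representative_frames(num_frames: int, k: int) -> List[int]:
    """Return k evenly spaced frame indices out of num_frames.

    Assumption: uniform sampling is only a stand-in for a real
    representative-frame selection strategy.
    """
    if num_frames <= 0 or k <= 0:
        return []
    if k >= num_frames:
        # Fewer frames than requested: keep them all.
        return list(range(num_frames))
    step = num_frames / k
    # Take the frame nearest the middle of each of the k equal segments.
    return [int(i * step + step / 2) for i in range(k)]


def temporal_pool(features: Sequence[Vector], mode: str = "avg") -> Vector:
    """Collapse per-frame feature vectors into one vector per person.

    Assumption: 'avg' and 'max' are illustrative pooling operators.
    """
    if not features:
        raise ValueError("at least one feature vector is required")
    # Transpose so each tuple holds one feature dimension across frames.
    columns = list(zip(*features))
    if mode == "avg":
        return [sum(col) / len(col) for col in columns]
    if mode == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown pooling mode: {mode}")


if __name__ == "__main__":
    # Toy sequence: 10 frames, each with a 2-dimensional feature vector.
    frame_features = [[float(i), float(10 - i)] for i in range(10)]
    chosen = select_representative_frames(len(frame_features), k=2)
    pooled = temporal_pool([frame_features[i] for i in chosen], mode="avg")
    print(chosen, pooled)
```

In this sketch, selection reduces the number of frames whose features must be extracted and stored, while pooling reduces each person to a single vector before matching; how the two are best combined depends on the dataset and the application.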
Acknowledgments
This research is funded by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.01-2017.315.
About this article
Cite this article
Nguyen, TB., Le, TL., Devillaine, L. et al. Effective multi-shot person re-identification through representative frames selection and temporal feature pooling. Multimed Tools Appl 78, 33939–33967 (2019). https://doi.org/10.1007/s11042-019-08183-y
DOI: https://doi.org/10.1007/s11042-019-08183-y