Efficient lightweight video person re-identification with online difference discrimination module

Gao, Cunyuan; Yao, Rui; Zhou, Yong; Zhao, Jiaqi; Fang, Liang; Hu, Fuyuan

doi:10.1007/s11042-021-10543-6

1182: Deep Processing of Multimedia Data
Published: 30 January 2021

Volume 81, pages 19169–19181, (2022)

Multimedia Tools and Applications

Cunyuan Gao¹,
Rui Yao¹,² (ORCID: orcid.org/0000-0003-2734-915X),
Yong Zhou¹,
Jiaqi Zhao¹,
Liang Fang¹ &
…
Fuyuan Hu²

Abstract

Video person re-identification (video Re-ID) is a key technology in video surveillance and security. Typical person re-identification retrieves the correct match for a target image (the query) from a set of gallery images, while video Re-ID extends this to querying gallery videos. Two main factors affect a video Re-ID model: (i) a high-quality frame-level feature extractor, and (ii) temporal modeling that combines frame-level features into a single feature for retrieval. In this work, we use a lightweight algorithm based on ShuffleNet V2 for video Re-ID, which meets the demands of practical applications by reducing the consumption of computing resources while maintaining high performance. In addition, we use the lightweight spatial attention mechanism Spatial Group-wise Enhance (SGE) to examine the person in more detail, which makes the feature representation more compact and improves retrieval accuracy. Finally, we design an Online Difference Discrimination (ODD) module that measures the feature gap between video frames, and we use this module to apply different temporal modeling to video sequences of different quality. Experiments on three datasets (iLIDS-VID, PRID2011 and MARS) show that our method is competitive with state-of-the-art methods.
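
The abstract does not say how the ODD module measures the feature gap or which temporal models it selects between, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes frame-level features are already extracted, uses the mean Euclidean distance between consecutive normalized frame features as a stand-in for the "feature gap", and switches between plain average pooling and an outlier-down-weighting pooling. The function names, the threshold `gap_threshold`, and the weighting scheme are all hypothetical.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def l2_normalize(v: Sequence[float]) -> Vector:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def mean_frame_gap(frames: Sequence[Sequence[float]]) -> float:
    """Assumed gap measure: mean distance between consecutive frame features."""
    normed = [l2_normalize(f) for f in frames]
    if len(normed) < 2:
        return 0.0
    gaps = [math.dist(a, b) for a, b in zip(normed, normed[1:])]
    return sum(gaps) / len(gaps)


def aggregate(frames: Sequence[Sequence[float]], gap_threshold: float = 0.5) -> Vector:
    """Combine frame-level features into one sequence-level feature.

    Hypothetical switch: if frames agree (small gap), use average pooling;
    otherwise weight each frame by its closeness to the mean feature so
    that outlier frames contribute less.
    """
    if not frames:
        raise ValueError("at least one frame feature is required")
    normed = [l2_normalize(f) for f in frames]
    dim = len(normed[0])
    mean = [sum(f[i] for f in normed) / len(normed) for i in range(dim)]
    if mean_frame_gap(frames) <= gap_threshold:
        return l2_normalize(mean)
    weights = [math.exp(-math.dist(f, mean)) for f in normed]
    total = sum(weights)
    pooled = [sum(w * f[i] for w, f in zip(weights, normed)) / total for i in range(dim)]
    return l2_normalize(pooled)


def rank_gallery(query: Sequence[float], gallery: Dict[str, Sequence[float]]) -> List[str]:
    """Return gallery ids ordered from closest to farthest from the query."""
    return sorted(gallery, key=lambda key: math.dist(query, gallery[key]))
```

In this sketch, a query video and each gallery video would be passed through `aggregate` first, and `rank_gallery` would then order the gallery by distance to the query feature. A real system would obtain the frame features from a trained network such as the ShuffleNet V2 backbone named in the abstract.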

Acknowledgement

This work was supported in part by the National Natural Science Foundation of China (No. 61772530, No. 61806206, No. 61876121), in part by the State’s Key Project of Research and Development Plan of China (No. 2016YFC0600908), in part by the Natural Science Foundation of Jiangsu Province of China (No. BK20171192, No. BK20180639), in part by the Six Talent Peaks Project in Jiangsu Province (No. 2018-XYDXX-044), in part by the Open Foundation of the Suzhou Smart City Research Institute, Suzhou University of Science and Technology (No. SZSCR2019005), and in part by the Xuzhou Science and Technology Plan Funds (No. KC19005).

Author information

Authors and Affiliations

School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China
Cunyuan Gao, Rui Yao, Yong Zhou, Jiaqi Zhao & Liang Fang
The Suzhou Smart City Research Institute, Suzhou University of Science and Technology, Suzhou, 215009, China
Rui Yao & Fuyuan Hu

Corresponding author

Correspondence to Rui Yao.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Gao, C., Yao, R., Zhou, Y. et al. Efficient lightweight video person re-identification with online difference discrimination module. Multimed Tools Appl 81, 19169–19181 (2022). https://doi.org/10.1007/s11042-021-10543-6

Received: 19 June 2020
Revised: 29 December 2020
Accepted: 13 January 2021
Published: 30 January 2021
Issue Date: June 2022
DOI: https://doi.org/10.1007/s11042-021-10543-6
