Person Reidentification using 3D inception based Spatio-temporal features learning, attribute recognition, and Reranking

Choudhary, Meenakshi; Tiwari, Vivek; Jain, Swati; Rajpoot, Vikram

doi:10.1007/s11042-023-15473-z

Published: 11 May 2023

Multimedia Tools and Applications, volume 83, pages 2007–2030 (2024)

Meenakshi Choudhary¹,
Vivek Tiwari²,
Swati Jain³ &
Vikram Rajpoot⁴

Abstract

Video-based person re-identification is the task of identifying pedestrians in video sequences captured by multiple non-overlapping cameras. Successive frames in a video clip capture a pedestrian's motion patterns and show the person's appearance from varying angles and in different body poses; they therefore provide critical features for countering occlusion, pose variation, viewpoint change, and similar challenges. This article proposes a novel person re-identification methodology built on a 3D Inception-based person re-identification model that comprises four three-dimensional (3D) Inception modules with 3D convolution and pooling layers. The 3D Inception modules expand the receptive fields of neurons in both the temporal and spatial dimensions. As a result, the model learns discriminative appearance features along with pedestrians' long-term and short-term motion patterns without any motion approximation module. The model is trained with a unified loss function that integrates center loss with the usual identification loss to reduce intra-class differences while increasing inter-class differences. The proposed method also incorporates an attribute recognition model to identify discriminative attributes in the video frames. The spatio-temporal and attribute features are then used by a reranking method, which returns the k most similar video clips for a given input. The effectiveness of the proposed method is validated through extensive experiments on three realistic surveillance video datasets: MARS, DukeMTMC-VideoReID, and iLIDS.
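
The unified loss mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common form in which the identification loss is a softmax cross-entropy and the center loss is half the squared distance between each feature and its class center, with both terms averaged over the batch. The weighting factor `lam`, the function names, and the batch averaging are assumptions made for illustration; the abstract does not state the authors' exact formulation.

```python
import math


def identification_loss(logits, labels):
    """Mean softmax cross-entropy over a batch of logit rows."""
    total = 0.0
    for row, y in zip(logits, labels):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_sum_exp - row[y]
    return total / len(labels)


def center_loss(features, labels, centers):
    """Mean over the batch of 0.5 * ||feature - center of its class||^2."""
    total = 0.0
    for f, y in zip(features, labels):
        total += 0.5 * sum((a - b) ** 2 for a, b in zip(f, centers[y]))
    return total / len(labels)


def unified_loss(logits, features, labels, centers, lam=0.5):
    """Identification loss plus a weighted center loss (lam is hypothetical)."""
    return identification_loss(logits, labels) + lam * center_loss(
        features, labels, centers
    )


if __name__ == "__main__":
    # Toy batch: two clips, two identities, 2-D features.
    logits = [[2.0, 0.5], [0.1, 1.5]]
    features = [[1.0, 2.0], [3.0, 1.0]]
    labels = [0, 1]
    centers = [[0.8, 1.9], [2.5, 1.2]]
    print(unified_loss(logits, features, labels, centers))
```

The center term pulls features of the same identity toward a shared center (smaller intra-class difference), while the cross-entropy term keeps different identities separable. In practice the class centers are usually learned alongside the network weights rather than fixed as they are here.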

Data availability

The data that support the findings of this study are openly available and are cited in the text.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, IIIT Pune, Pune, India
Meenakshi Choudhary
Department of Computer Science and Engineering, DSPM IIIT Naya Raipur, Raipur, Chhattisgarh, India
Vivek Tiwari
Govt. J. Yoganandam Chhattisgarh College, Raipur, CG, India
Swati Jain
Department of Information Technology, Madhav Institute of Technology & Science, Gwalior, India
Vikram Rajpoot

Corresponding author

Correspondence to Vivek Tiwari.

Ethics declarations

Funding and/or conflicts of interest/competing interests: I declare on behalf of the authors that there is no conflict of interest, either non-financial or commercial, among the authors.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Choudhary, M., Tiwari, V., Jain, S. et al. Person Reidentification using 3D inception based Spatio-temporal features learning, attribute recognition, and Reranking. Multimed Tools Appl 83, 2007–2030 (2024). https://doi.org/10.1007/s11042-023-15473-z

Received: 11 July 2022
Revised: 18 October 2022
Accepted: 18 April 2023
Published: 11 May 2023
Issue Date: January 2024
DOI: https://doi.org/10.1007/s11042-023-15473-z
