Abstract
Gait recognition is an essential biometric technology, widely used in security surveillance and related fields. The key to gait recognition is extracting robust spatio-temporal features from pedestrians' gait silhouette sequences. Owing to silhouette overlap and cross-view variation, learning distinguishable spatio-temporal features and reducing the loss of recognition accuracy under occlusions such as backpacks and coats is a crucial problem for gait recognition. In this paper, we propose a two-branch gait recognition network containing global and local branches, which extracts global and local spatio-temporal gait features using 3D convolutional neural networks and maps the obtained features to the corresponding dimensions. Specifically, in the global branch we design an enhanced 3D convolution module (E3D) for global spatio-temporal feature extraction. In the local branch, we design the Partical-E3D module, which divides the feature maps into equal horizontal strips along the vertical direction, applies the E3D module to each strip, and cascades the results to obtain the local features. In addition, for the features extracted in both branches, we perform feature mapping not only in the spatial dimension but also in the temporal dimension, obtaining gait features that carry both temporal and spatial information. Experiments on the CASIA-B dataset show that our method achieves rank-1 accuracy rates of 97.8%, 95.1% and 84.3% under the normal walking (NM), bag-carrying (BG) and coat-wearing (CL) conditions, respectively. These results demonstrate that our method outperforms existing gait recognition methods, achieving state-of-the-art performance, with the improvement most pronounced under the BG and CL conditions.
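The local branch's vertical partitioning described above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: max-pooling stands in for the E3D module, and the tensor shapes (64 channels, 30 frames, 64x44 silhouettes) are illustrative assumptions.

```python
import numpy as np

def horizontal_partition_features(feat, num_parts=4):
    """Split a (C, T, H, W) spatio-temporal feature map into equal
    horizontal strips, extract one vector per strip, and cascade them.

    Max-pooling is a placeholder for the per-part feature extractor
    (the paper applies its E3D module to each strip instead).
    """
    c, t, h, w = feat.shape
    assert h % num_parts == 0, "height must divide evenly into parts"
    strip_h = h // num_parts
    part_vectors = []
    for p in range(num_parts):
        # Take one horizontal strip covering rows [p*strip_h, (p+1)*strip_h).
        strip = feat[:, :, p * strip_h:(p + 1) * strip_h, :]
        # Pool over time, height and width, leaving a C-dim vector per part.
        part_vectors.append(strip.max(axis=(1, 2, 3)))
    # Cascade the per-part vectors into one local feature.
    return np.concatenate(part_vectors)  # shape: (num_parts * C,)

# Hypothetical feature map: C=64 channels, T=30 frames, 64x44 silhouettes.
feats = np.random.rand(64, 30, 64, 44)
local_feat = horizontal_partition_features(feats, num_parts=4)
print(local_feat.shape)  # (256,)
```

Partitioning only along the height axis reflects the observation that different body regions (head, torso, legs) occupy distinct horizontal bands of an aligned silhouette, so each strip captures part-specific motion.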
Availability of data and materials
The datasets analyzed during the current study are available in the [CASIA] repository, [http://www.cbsr.ia.ac.cn/china/Gait%20Databases%20CH.asp].
Acknowledgements
This work is supported by the National Natural Science Foundation of China [Grant Number 62272016] and the National Natural Science Foundation for Young Scientists of China [Grant Number 61906008].
Contributions
DH conceived the paper and proposed the main idea. HH designed the experiments, wrote the manuscript and analyzed the proposed method; HH is the main author of this paper. YZ, YS and JW suggested improvements. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethical approval
This article does not contain any data or other information from studies or experimentation involving human or animal subjects.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
Cite this article
Huang, H., Zhang, Y., Si, Y. et al. Two-branch 3D convolution neural network for gait recognition. SIViP 17, 3495–3504 (2023). https://doi.org/10.1007/s11760-023-02573-4