Multi-granularity cross attention network for person re-identification

Han, Chengmei; Jiang, Bo; Tang, Jin

doi:10.1007/s11042-022-13833-9

Published: 06 October 2022

Volume 82, pages 14755–14773, (2023)
Multimedia Tools and Applications

Abstract

Typical person re-identification (Re-ID) methods face common challenges such as body misalignment, occlusion, background perturbation, and pose variation. To address these problems, combining global and local features lets the network attend to both the global and the local information in an image. Attention mechanisms, which aim to strengthen salient information and suppress irrelevant information, have also proved effective. To further enhance the contribution of global information to salient information, we propose a multi-granularity cross attention (MGCA) network for person Re-ID. The key component of our framework is the multi-granularity cross attention module, which selectively aggregates the features at each location by taking a weighted sum of location features, weighted according to each pixel's contribution to saliency. The module thereby obtains a global view of the image and the spatial correlation between any two positions. Related semantic features reinforce each other, further improving intra-class compactness and semantic consistency and yielding feature refinement and feature-pair alignment, respectively. Extensive experiments demonstrate that our method is comparable to state-of-the-art methods.
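The aggregation step described in the abstract can be illustrated with a generic position-attention computation. The following is a minimal sketch under stated assumptions, not the MGCA module itself: it uses plain dot-product similarity with a softmax, omits the learned projections and multi-granularity branches a real network would have, and the names `position_attention` and `feature_map` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def position_attention(features):
    """Generic position attention (assumption: no learned projections).

    features: list of N position vectors, each a list of C floats.
    Returns (refined, weights), where weights[i][j] is the normalized
    similarity between positions i and j, and refined[i] is the
    weighted sum of all position vectors using weights[i].
    """
    # Pairwise similarity between any two positions, normalized per row.
    weights = [softmax([dot(q, k) for k in features]) for q in features]
    channels = len(features[0])
    # Each output position aggregates features from every position.
    refined = [
        [sum(w * f[c] for w, f in zip(row, features)) for c in range(channels)]
        for row in weights
    ]
    return refined, weights


# Toy 2x2 feature map flattened to 4 positions with 2 channels each.
feature_map = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
refined, weights = position_attention(feature_map)
```

In this toy input, positions 0 and 1 carry similar features, so position 0 assigns more weight to position 1 than to position 2, which is the sense in which related features reinforce each other.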

Data Availability

The dataset sources are listed in the paper.

Acknowledgements

This research is partially funded by the Major Project for New Generation of AI under Grant (2018AAA0100400), the National Natural Science Foundation of China (Nos. 61976002, 61860206004, and U20B2068), the Natural Science Foundation of Anhui Higher Education Institutions of China (KJ2019A0033), and the Key Scientific Research Project of Hefei Normal University (2021KJZD18, 2021KJZD13).

Author information

Authors and Affiliations

School of Computer Science and Technology, Hefei Normal University, Hefei, China
Chengmei Han
Anhui Provincial Key Laboratory of Multimodal Cognitive Computation, School of Computer Science and Technology, Anhui University, Hefei, China
Chengmei Han, Bo Jiang & Jin Tang

Corresponding author

Correspondence to Chengmei Han.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Han, C., Jiang, B. & Tang, J. Multi-granularity cross attention network for person re-identification. Multimed Tools Appl 82, 14755–14773 (2023). https://doi.org/10.1007/s11042-022-13833-9

Received: 01 June 2021
Revised: 13 June 2022
Accepted: 06 September 2022
Published: 06 October 2022
Issue Date: April 2023
DOI: https://doi.org/10.1007/s11042-022-13833-9
