Abstract
Person re-identification (Re-ID) methods commonly suffer from body misalignment, occlusion, background interference, pose variation, and related challenges. To address these problems, combining global and local features lets the network attend to both the global and the local information in an image. The attention mechanism, which strengthens salient information and suppresses irrelevant information, has also proven effective. To further enhance the contribution of global information to salient information, we propose a multi-granularity cross attention (MGCA) network for person Re-ID. The key component of our framework is the multi-granularity cross attention module, which selectively aggregates the features at each location by computing a weighted sum of the location features, with weights based on each pixel's contribution to saliency. The module thereby captures a global view of the image and the spatial correlation between any two positions. Semantically related features reinforce each other, which further improves intra-class compactness and semantic consistency and yields feature refinement and feature-pair alignment, respectively. Extensive experiments demonstrate that our method performs comparably to state-of-the-art methods.
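The weighted aggregation described in the abstract can be illustrated with a generic spatial self-attention step. The following is a minimal sketch under stated assumptions, not the MGCA module itself: it omits learned query, key, and value projections and the multi-granularity branches, and the names `spatial_attention` and `gamma` are hypothetical choices made for this illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def spatial_attention(features, gamma=1.0):
    """Refine each position with a weighted sum over all positions.

    features: list of N feature vectors (one per spatial location), each a
              list of C floats, i.e. a flattened (H*W) x C feature map.
    gamma:    hypothetical residual scale (an assumption, not from the paper).
    Returns (refined_features, attention_weights).
    """
    refined, weights = [], []
    for query in features:
        # Affinity between this position and every position (dot product).
        scores = [sum(q * k for q, k in zip(query, key)) for key in features]
        attn = softmax(scores)  # contribution of each position, sums to 1
        # Weighted sum of the features of all positions.
        context = [
            sum(w * feat[c] for w, feat in zip(attn, features))
            for c in range(len(query))
        ]
        # Residual connection keeps the original local feature.
        refined.append([x + gamma * ctx for x, ctx in zip(query, context)])
        weights.append(attn)
    return refined, weights


# Toy 2x2 feature map flattened to 4 positions with 2 channels each.
feature_map = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
refined, weights = spatial_attention(feature_map)
```

In this toy input, positions 0 and 1 carry similar features, so position 0 assigns more attention weight to position 1 than to position 2. This mirrors the idea that semantically related features reinforce each other.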
Data Availability
The sources of the datasets are listed in the paper.
Acknowledgements
This research is partially funded by the Major Project for New Generation of AI under Grant (2018AAA0100400), the National Natural Science Foundation of China (Nos. 61976002, 61860206004, and U20B2068), the Natural Science Foundation of Anhui Higher Education Institutions of China (KJ2019A0033), and the Key Scientific Research Project of Hefei Normal University (2021KJZD18, 2021KJZD13).
Ethics declarations
Conflict of Interests
The authors declare that they have no conflict of interest.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Han, C., Jiang, B. & Tang, J. Multi-granularity cross attention network for person re-identification. Multimed Tools Appl 82, 14755–14773 (2023). https://doi.org/10.1007/s11042-022-13833-9
DOI: https://doi.org/10.1007/s11042-022-13833-9