
Multi-granularity cross attention network for person re-identification

Multimedia Tools and Applications

Abstract

Person re-identification (Re-ID) methods commonly struggle with body misalignment, occlusion, background perturbation, and pose variation, among other factors. To address these problems, combining global and local features lets the network attend to both the global and the local information in the image. Attention mechanisms have also proven effective, since they strengthen salient information and suppress irrelevant information. To further enhance the contribution of global information to salient information, we propose a multi-granularity cross attention (MGCA) network for person Re-ID. The key component of our framework is the multi-granularity cross attention module, which selectively aggregates the features at each location by taking a weighted sum of the features of all locations, with weights based on each pixel’s contribution to saliency. The module thereby obtains a global view of the image and the spatial correlation between any two positions. Related semantic features reinforce each other, which further improves intra-class compactness and semantic consistency and yields feature refinement and feature-pair alignment, respectively. Extensive experiments demonstrate that our method is comparable to state-of-the-art methods.
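The aggregation step described in the abstract (a weighted sum over all locations, with weights derived from pairwise spatial correlation) can be illustrated with a generic position-attention sketch. This is not the authors' MGCA module: the dot-product similarity, the softmax normalization, the residual addition, and the names `softmax` and `position_attention` are all assumptions chosen for illustration, and the multi-granularity and cross-branch parts of the method are not modeled here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def position_attention(features):
    """Generic position attention (illustrative assumption, not MGCA itself).

    features: list of N location vectors, each a list of C floats.
    Returns (attended, weights), where weights[i][j] is how much
    location i draws on location j.
    """
    attended, weights = [], []
    for query in features:
        # Pairwise correlation between this location and every location.
        scores = [sum(a * b for a, b in zip(query, key)) for key in features]
        row = softmax(scores)
        weights.append(row)
        # Weighted sum of the features of all locations.
        mixed = [
            sum(row[j] * features[j][c] for j in range(len(features)))
            for c in range(len(query))
        ]
        # Residual addition keeps the original local feature (assumed design).
        attended.append([m + x for m, x in zip(mixed, query)])
    return attended, weights


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    out, attn = position_attention(feats)
    for row in attn:
        print([round(w, 3) for w in row])
```

Each row of `attn` sums to one, so every output vector is a convex combination of all location features plus the original feature, which is one simple way to give each position a global view of the image.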



Data Availability

The dataset sources are listed in the paper.


Acknowledgements

This research is partially funded by the Major Project for New Generation of AI under Grant 2018AAA0100400, the National Natural Science Foundation of China (Nos. 61976002, 61860206004, and U20B2068), the Natural Science Foundation of Anhui Higher Education Institutions of China (KJ2019A0033), and the Key Scientific Research Project of Hefei Normal University (2021KJZD18, 2021KJZD13).

Author information

Corresponding author

Correspondence to Chengmei Han.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Han, C., Jiang, B. & Tang, J. Multi-granularity cross attention network for person re-identification. Multimed Tools Appl 82, 14755–14773 (2023). https://doi.org/10.1007/s11042-022-13833-9


  • DOI: https://doi.org/10.1007/s11042-022-13833-9
