Cross-modality person re-identification with triple-attentive feature aggregation

Huang, Pan; Zhu, Songhao; Wang, Dongsheng; Liang, Zhiwei

doi:10.1007/s11042-021-11739-6

Published: 09 December 2021

Volume 81, pages 4455–4473, (2022)

Multimedia Tools and Applications

Pan Huang¹,
Songhao Zhu ORCID: orcid.org/0000-0002-9891-5692¹,
Dongsheng Wang¹ &
…
Zhiwei Liang¹

Abstract

Cross-modal person re-identification between the visible (RGB) and infrared (IR) modalities is important for nighttime surveillance applications. Beyond the cross-modal discrepancy caused by different camera spectra, visible-infrared person re-identification (VI-ReID) is also affected by large cross-modal and intra-modal variations caused by different camera views and person poses. Moreover, existing VI-ReID methods tend to learn global representations with limited discriminative power and weak robustness to noisy images. In this paper, we propose a novel triple-attentive aggregation (TAANet) learning method that mines intra-modal hierarchical and cross-modal graph-level contextual cues for VI-ReID. We propose an intra-modal hybrid weight attention module, which extracts discriminative locally aggregated features by mining channel and local feature relationships. To enhance robustness to noisy samples, we introduce an improved triplet loss combined with a center loss. This loss takes into account the distance from each sample to its nearest different class, so that a margin is maintained between classes and the features become more discriminative. Extensive experiments show that TAANet outperforms state-of-the-art methods in a variety of settings.
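
The abstract does not give the loss formula, so the following is only a rough sketch of the general idea of pairing a triplet-style margin term with a center loss. It is not the authors' formulation. The function names, the use of class centers as the "nearest different class", and the `margin` and `center_weight` values are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def class_centers(features, labels):
    """Mean feature vector for each identity label in the batch."""
    grouped = defaultdict(list)
    for feat, label in zip(features, labels):
        grouped[label].append(feat)
    return {
        label: [sum(col) / len(feats) for col in zip(*feats)]
        for label, feats in grouped.items()
    }


def center_triplet_loss(features, labels, margin=0.3, center_weight=0.5):
    """Hypothetical combined loss (an assumption, not the paper's formula).

    For each sample: a hinge term that asks the sample to be closer to its
    own class center than to the nearest other-class center by `margin`,
    plus a center term that penalises the squared distance to its own center.
    """
    centers = class_centers(features, labels)
    if len(centers) < 2:
        raise ValueError("need at least two identities in the batch")
    total = 0.0
    for feat, label in zip(features, labels):
        d_own = euclidean(feat, centers[label])
        # Distance to the closest center belonging to a different identity.
        d_other = min(
            euclidean(feat, center)
            for other, center in centers.items()
            if other != label
        )
        hinge = max(0.0, margin + d_own - d_other)
        total += hinge + center_weight * d_own ** 2
    return total / len(features)


# Toy batch: two identities, two 2-D features each (made-up numbers).
toy_features = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
toy_labels = [0, 0, 1, 1]
print(center_triplet_loss(toy_features, toy_labels))
```

In this sketch the hinge term vanishes once identities are well separated, leaving only the center term to tighten each class. A real VI-ReID pipeline would compute such a loss on learned embeddings from both modalities inside an automatic-differentiation framework.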

Author information

Authors and Affiliations

College of Automation and Artificial Intelligence, Nanjing University of Posts and Telecommunications, Nanjing, 210046, China
Pan Huang, Songhao Zhu, Dongsheng Wang & Zhiwei Liang

Corresponding author

Correspondence to Songhao Zhu.

About this article

Cite this article

Huang, P., Zhu, S., Wang, D. et al. Cross-modality person re-identification with triple-attentive feature aggregation. Multimed Tools Appl 81, 4455–4473 (2022). https://doi.org/10.1007/s11042-021-11739-6

Received: 06 May 2021
Revised: 04 September 2021
Accepted: 08 November 2021
Published: 09 December 2021
Issue Date: January 2022
DOI: https://doi.org/10.1007/s11042-021-11739-6
