Cross modality person re-identification via mask-guided dynamic dual-task collaborative learning

Shao, Wenbin; Liu, Yujie; Zhang, Wenxin; Li, Zongmin

doi:10.1007/s10489-024-05344-x

Published: 08 March 2024

Applied Intelligence, volume 54, pages 3723–3736 (2024)

Wenbin Shao¹, Yujie Liu¹ (ORCID: orcid.org/0000-0003-1001-963X), Wenxin Zhang¹ & Zongmin Li¹

Abstract

Visible-infrared cross-modality person re-identification (CM-ReID) has received extensive attention in the community because of its applicability to 24-hour surveillance. The large modality discrepancy makes the task highly susceptible to background clutter, especially in infrared images. In this paper, we propose a mask-guided dynamic dual-task collaborative learning (MG-DDCL) method to extract background-irrelevant pedestrian representations. A dynamic dual-task collaborative learning strategy extracts pedestrian representations and generates foreground masks with a single unified convolutional neural network. This strategy improves mAP by 0.95% and Rank-1 accuracy by 1.9%. So that the guidance mask facilitates the cross-modality person re-identification task, we convert the hard mask produced by semantic segmentation into a friendlier soft mask and generate the foreground response map through regression learning. Compared with a classification-based formulation, this approach has significant advantages. Extensive experiments on two datasets, SYSU-MM01 and RegDB, demonstrate the effectiveness of the proposed method.
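The abstract does not specify how the hard mask is softened or which regression loss is used. As a rough illustration of the idea only, the sketch below assumes a box (mean) filter for softening and a mean squared error for the regression target. The names `soften_mask` and `mse_loss`, the `radius` parameter, and the toy 4x4 mask are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: the smoothing scheme and the loss are assumptions,
# not the authors' published implementation.

def soften_mask(hard_mask, radius=1):
    """Turn a binary (hard) mask into a soft mask with a box (mean) filter.

    Each output value is the mean of the hard-mask values inside a
    (2*radius+1) x (2*radius+1) window clipped at the image border, so it
    lies in [0, 1] and falls off gradually around the silhouette edge.
    """
    height, width = len(hard_mask), len(hard_mask[0])
    soft = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(height, y + radius + 1)):
                for xx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += hard_mask[yy][xx]
                    count += 1
            soft[y][x] = total / count
    return soft


def mse_loss(predicted, target):
    """Mean squared error between two equally sized 2D maps (a regression loss)."""
    squared = [
        (p - t) ** 2
        for pred_row, target_row in zip(predicted, target)
        for p, t in zip(pred_row, target_row)
    ]
    return sum(squared) / len(squared)


# Toy hard mask: a 2x2 "pedestrian" in the middle of a 4x4 image.
hard = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
soft = soften_mask(hard, radius=1)

# A predicted foreground response map would be regressed toward `soft`;
# here the hard mask stands in for a prediction to show a non-zero loss.
loss = mse_loss(hard, soft)
```

In this toy setting the soft target takes intermediate values near the silhouette boundary instead of jumping between 0 and 1, which is the property a regression target can exploit and a per-pixel class label cannot.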

Availability of data and materials

The experiments use publicly available datasets. The source code can be obtained by contacting the corresponding author.

Acknowledgements

This work was supported by the National Key R&D Program under Grant No. 2019YFF0301800, the National Natural Science Foundation of China under Grant No. 61379106, and the Shandong Provincial Natural Science Foundation under Grant Nos. ZR2013FM036 and ZR2015FM011.

Author information

Authors and Affiliations

School of Computer Science and Technology, China University of Petroleum (East China), Changjiang West Road, Qingdao, 266580, Shandong, China
Wenbin Shao, Yujie Liu, Wenxin Zhang & Zongmin Li

Contributions

The first author, Wenbin Shao, designed the method and wrote the paper, and is its major contributor. The corresponding author, Yujie Liu, provided comprehensive guidance. The third author (Wenxin Zhang) and the fourth author (Zongmin Li) provided valuable advice and help.

Corresponding author

Correspondence to Yujie Liu.

Ethics declarations

Competing interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shao, W., Liu, Y., Zhang, W. et al. Cross modality person re-identification via mask-guided dynamic dual-task collaborative learning. Appl Intell 54, 3723–3736 (2024). https://doi.org/10.1007/s10489-024-05344-x

Accepted: 14 February 2024
Published: 08 March 2024
Issue Date: March 2024
DOI: https://doi.org/10.1007/s10489-024-05344-x
