
Cross modality person re-identification via mask-guided dynamic dual-task collaborative learning

Applied Intelligence

Abstract

Visible-infrared cross-modality person re-identification (CM-ReID) has received extensive attention in the community because of its applicability to 24-hour surveillance. The large modality discrepancy makes CM-ReID highly susceptible to background clutter, especially in infrared images. In this paper, we propose a mask-guided dynamic dual-task collaborative learning (MG-DDCL) method to extract background-irrelevant pedestrian representations. A dynamic dual-task collaborative learning strategy extracts pedestrian representations and generates foreground masks with a single unified convolutional neural network. This strategy improves mAP by 0.95% and Rank-1 accuracy by 1.9%. So that the guidance mask facilitates the cross-modality person re-identification task, we convert the hard mask produced by semantic segmentation into a friendlier soft mask and generate the foreground response map through regression learning. Compared with a classification formulation, our method shows significant advantages. Extensive experiments on two datasets, SYSU-MM01 and RegDB, demonstrate the effectiveness of the proposed method.
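The hard-mask to soft-mask idea and the regression objective can be illustrated with a small sketch. The abstract does not say how the soft mask is built or which regression loss is used, so everything here is an assumption made for illustration: the box-filter smoothing, the mean squared error objective, and the names `soften_mask` and `mse_loss` are hypothetical and not taken from the paper.

```python
def soften_mask(hard_mask, radius=1):
    """Turn a binary (0/1) mask into a soft mask with values in [0, 1].

    Assumption: a simple box-filter average over a (2*radius+1)^2 window,
    clipped at the image border. The paper's actual procedure may differ.
    """
    height, width = len(hard_mask), len(hard_mask[0])
    soft = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        total += hard_mask[ny][nx]
                        count += 1
            soft[y][x] = total / count
    return soft


def mse_loss(predicted, target):
    """Mean squared error between two same-sized 2D maps.

    Assumption: MSE stands in for the regression objective; a
    classification formulation would instead use a per-pixel
    cross-entropy against the hard 0/1 labels.
    """
    n = sum(len(row) for row in target)
    squared_error = sum(
        (p - t) ** 2
        for p_row, t_row in zip(predicted, target)
        for p, t in zip(p_row, t_row)
    )
    return squared_error / n


# Toy 4x4 hard mask: a 2x2 foreground block surrounded by background.
hard = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
soft = soften_mask(hard)

# A predicted foreground response map would be regressed toward `soft`.
loss_against_soft = mse_loss(hard, soft)
```

In this toy case the soft target keeps graded values near the silhouette boundary, so a regressed response map is not pushed toward a strict 0/1 decision at every pixel.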

Availability of data and materials

The experimental data come from publicly available datasets. The source code can be obtained by contacting the corresponding author.

Acknowledgements

This work was supported by the National Key R&D Program under Grant No. 2019YFF0301800, the National Natural Science Foundation of China under Grant No. 61379106, and the Shandong Provincial Natural Science Foundation under Grant Nos. ZR2013FM036 and ZR2015FM011.

Author information

Contributions

The first author, Wenbin Shao, designed the method and wrote the paper, and is its major contributor. The corresponding author, Yujie Liu, provided comprehensive guidance. The third author (Wenxin Zhang) and fourth author (Zongmin Li) provided valuable advice and help.

Corresponding author

Correspondence to Yujie Liu.

Ethics declarations

Competing interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shao, W., Liu, Y., Zhang, W. et al. Cross modality person re-identification via mask-guided dynamic dual-task collaborative learning. Appl Intell 54, 3723–3736 (2024). https://doi.org/10.1007/s10489-024-05344-x

  • DOI: https://doi.org/10.1007/s10489-024-05344-x
