
Modality-agnostic learning for robust visible-infrared person re-identification

  • Original Paper
  • Published:
Signal, Image and Video Processing

Abstract

Visible-infrared person re-identification is inherently difficult: it must cope with large intra-class variance and pronounced inter-modal disparities. Existing approaches address these challenges by constructing comprehensive data representations through joint learning of multi-modal samples or through cross-modal transformation techniques. However, lacking a dynamic modulation mechanism, they cannot adapt modality-specific features effectively, so the shared feature space they learn is neither robust nor well generalized. To address these issues, we introduce the dual dynamic modality alignment network, a framework that dynamically calibrates the importance of modality-specific features, extracting the most informative cues while minimizing reliance on extraneous information. Central to our approach is the class-aware modality hybrid-assisted generator, which treats the multimodal contrastive representation space as a set of nodes, integrates diverse contrastive representations, and interlinks isolated representations to explore a wider range of contrastive relationships between modalities. We further propose an auxiliary modal identity center alignment loss that refines the feature distribution and reduces the divergence between visible and infrared image representations. Extensive experiments on the SYSU-MM01 and RegDB datasets demonstrate the superior performance of our method and its efficacy in creating a more discriminative and balanced shared feature space.
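The abstract's auxiliary modal identity center alignment loss is not specified here; a minimal sketch of the general idea it names, pulling per-identity feature centers of the two modalities together, might look like the following (the function name and the squared-distance formulation are illustrative assumptions, not the paper's exact definition):

```python
import numpy as np

def identity_center_alignment_loss(vis_feats, ir_feats, vis_labels, ir_labels):
    """Hypothetical center-alignment penalty: for each identity present in
    both modalities, compute its visible and infrared feature centers and
    average the squared distance between them."""
    shared_ids = np.intersect1d(vis_labels, ir_labels)
    dists = []
    for pid in shared_ids:
        c_vis = vis_feats[vis_labels == pid].mean(axis=0)  # visible center
        c_ir = ir_feats[ir_labels == pid].mean(axis=0)     # infrared center
        dists.append(np.sum((c_vis - c_ir) ** 2))
    return float(np.mean(dists)) if dists else 0.0

# Toy usage: two identities, three samples each, 4-D features.
rng = np.random.default_rng(0)
vis = rng.normal(size=(6, 4))
ir = rng.normal(size=(6, 4))
labels = np.array([0, 0, 0, 1, 1, 1])
loss = identity_center_alignment_loss(vis, ir, labels, labels)  # scalar >= 0
```

The penalty is zero exactly when the two modality centers coincide for every shared identity, which is the alignment behavior the abstract describes.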



Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 62376041 and 62466026), the China Postdoctoral Science Foundation (No. 2021M69236), the Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, Jilin University (No. 93K172021K01), the State Key Lab for Novel Software Technology, Nanjing University (No. KFKT2024B51), and the Scientific Research Foundation of the Education Department of Jiangxi Province (No. GJJ2200351).

Author information


Corresponding authors

Correspondence to Gengsheng Xie or Shan Zhong.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Gong, S., Li, S., Xie, G. et al. Modality-agnostic learning for robust visible-infrared person re-identification. SIViP 19, 200 (2025). https://doi.org/10.1007/s11760-024-03749-2
