An Effective Visible-Infrared Person Re-identification Network Based on Second-Order Attention and Mixed Intermediate Modality

  • Conference paper

Pattern Recognition and Computer Vision (PRCV 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14433)


Abstract

Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality pedestrian retrieval problem. Because of the significant cross-modality discrepancy, it is difficult to learn discriminative features. Attention-based methods have been widely used to extract discriminative features for VI-ReID, but existing methods are confined to first-order structures that exploit only simple, coarse information and therefore lack the capacity to learn both modality-irrelevant and modality-relevant features. In this paper, we extract second-order information from mid-level features to complement first-order cues. Specifically, we design a flexible second-order module that considers the correlations among common features and learns refined feature representations of pedestrian images. In addition, because a significant gap exists between the visible and infrared modalities, we propose a plug-and-play mixed intermediate modality module that generates intermediate-modality representations to reduce the discrepancy between visible and infrared features. Extensive experiments on two challenging datasets, SYSU-MM01 and RegDB, demonstrate that our method achieves competitive performance compared with state-of-the-art methods.
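To make the two components concrete, the sketch below gives a minimal, hypothetical PyTorch rendering of the ideas the abstract describes: a second-order block that attends over pairwise spatial correlations of mid-level features, and a mixing module that blends paired visible and infrared features into an intermediate-modality representation. The class names, layer choices, and weight-prediction scheme are illustrative assumptions of ours, not the paper's actual implementation.

import torch
import torch.nn as nn

class SecondOrderAttention(nn.Module):
    """Hypothetical sketch: build an attention map from second-order
    (pairwise) correlations between spatial positions of a mid-level
    feature map, then refine the first-order features residually."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // reduction, 1)
        self.key = nn.Conv2d(channels, channels // reduction, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).flatten(2)                          # (B, C/r, HW)
        k = self.key(x).flatten(2)                            # (B, C/r, HW)
        v = self.value(x).flatten(2)                          # (B, C,   HW)
        # Second-order statistics: correlations between all position pairs.
        attn = torch.softmax(q.transpose(1, 2) @ k, dim=-1)   # (B, HW, HW)
        out = (v @ attn.transpose(1, 2)).reshape(b, c, h, w)
        return x + self.gamma * out

class MixedIntermediateModality(nn.Module):
    """Hypothetical sketch: mix paired visible and infrared features with
    predicted per-sample weights, producing an intermediate-modality
    representation that bridges the two domains during training."""

    def __init__(self, channels: int):
        super().__init__()
        self.mixer = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(2 * channels, 2), nn.Softmax(dim=1),
        )

    def forward(self, feat_vis: torch.Tensor, feat_ir: torch.Tensor) -> torch.Tensor:
        # Per-sample convex mixing weights predicted from both modalities.
        w = self.mixer(torch.cat([feat_vis, feat_ir], dim=1))  # (B, 2)
        w_vis = w[:, 0].view(-1, 1, 1, 1)
        w_ir = w[:, 1].view(-1, 1, 1, 1)
        return w_vis * feat_vis + w_ir * feat_ir

if __name__ == "__main__":
    vis = torch.randn(4, 256, 24, 12)   # visible-branch mid-level features
    ir = torch.randn(4, 256, 24, 12)    # infrared-branch mid-level features
    refined = SecondOrderAttention(256)(MixedIntermediateModality(256)(vis, ir))
    print(refined.shape)                # torch.Size([4, 256, 24, 12])

In a full VI-ReID model, modules of this kind would typically sit between the stages of a two-stream ResNet backbone, with identity and triplet losses applied to the original and mixed branches alike; the paper's exact placement and losses may differ.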



Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grants U21A20514 and 62002302, by the FuXiaQuan National Independent Innovation Demonstration Zone Collaborative Innovation Platform Project under Grant 3502ZCQXT2022008, and by the China Fundamental Research Funds for the Central Universities under Grant 20720230038.

Author information

Corresponding author

Correspondence to Yang Lu.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Tao, H., Zhang, Y., Lu, Y., Wang, H. (2024). An Effective Visible-Infrared Person Re-identification Network Based on Second-Order Attention and Mixed Intermediate Modality. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14433. Springer, Singapore. https://doi.org/10.1007/978-981-99-8546-3_10

  • DOI: https://doi.org/10.1007/978-981-99-8546-3_10

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8545-6

  • Online ISBN: 978-981-99-8546-3

  • eBook Packages: Computer Science, Computer Science (R0)
