A Local-Global Self-attention Interaction Network for RGB-D Cross-Modal Person Re-identification

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2022)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13537))

  • 1452 Accesses

Abstract

RGB-D cross-modal person re-identification (Re-ID) aims to match person images across the RGB and depth modalities. The task is challenging because of the large discrepancy between the two modalities, in addition to common issues such as lighting conditions, human posture, and camera angle. Few studies currently address this task, and existing Re-ID methods tend to learn homogeneous structural relationships within an image, which limits their discriminability and makes them less robust to noisy images. In this paper, we propose a Local-Global Interaction Network dedicated to cross-modal problems. The network constrains the center distance between the two modalities and improves intra-class cross-modality similarity. It also learns local and global features of the different modalities to enrich the features extracted from each. We validate the effectiveness of our approach on public benchmark datasets. Experimental results demonstrate that our method outperforms other state-of-the-art methods in terms of visual quality and quantitative measurement.
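
The idea of constraining the center distance between modalities can be illustrated with a minimal sketch. The abstract does not give the loss actually used in the paper, so the code below is only an assumption in the style of a hetero-center loss: for each identity it averages the RGB features and the depth features separately, then penalizes the squared Euclidean distance between the two centers. The function names `modality_centers` and `center_distance_loss` are hypothetical, and features are plain Python lists rather than network outputs.

```python
from collections import defaultdict


def modality_centers(features, labels):
    """Return the mean feature vector (center) for each identity label."""
    groups = defaultdict(list)
    for vec, label in zip(features, labels):
        groups[label].append(vec)
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in groups.items()
    }


def center_distance_loss(rgb_feats, rgb_labels, depth_feats, depth_labels):
    """Mean squared Euclidean distance between per-identity RGB and depth centers.

    Assumed formulation for illustration only; not taken from the paper.
    Identities that appear in only one modality are ignored.
    """
    rgb_centers = modality_centers(rgb_feats, rgb_labels)
    depth_centers = modality_centers(depth_feats, depth_labels)
    shared = sorted(set(rgb_centers) & set(depth_centers))
    if not shared:
        return 0.0
    total = 0.0
    for label in shared:
        total += sum(
            (r - d) ** 2
            for r, d in zip(rgb_centers[label], depth_centers[label])
        )
    return total / len(shared)


# Toy usage: two identities with 2-D features in each modality.
rgb = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
rgb_ids = [0, 0, 1]
depth = [[2.0, 0.0], [0.0, 0.0]]
depth_ids = [0, 1]
loss = center_distance_loss(rgb, rgb_ids, depth, depth_ids)
```

Minimizing a term of this kind pulls the two modality centers of the same person together, which is one way to raise intra-class cross-modality similarity. In practice it would be combined with an identity or triplet loss so that different identities stay separated.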

Acknowledgements

This work was supported by the Hefei Municipal Natural Science Foundation under Grant No. 2021050 and by the National Natural Science Foundation of China under Grant No. 62172137.

Author information

Corresponding author

Correspondence to Xiaohong Li.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zhu, C., Li, X., Qi, M., Liu, Y., Zhang, L. (2022). A Local-Global Self-attention Interaction Network for RGB-D Cross-Modal Person Re-identification. In: Yu, S., et al. Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol 13537. Springer, Cham. https://doi.org/10.1007/978-3-031-18916-6_8

  • DOI: https://doi.org/10.1007/978-3-031-18916-6_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-18915-9

  • Online ISBN: 978-3-031-18916-6

  • eBook Packages: Computer Science, Computer Science (R0)
