Abstract
RGB-D cross-modal person re-identification (Re-ID) aims to match images of a person across the RGB and depth modalities. The task is challenging because of the large discrepancy between the two modalities, in addition to common issues such as lighting conditions, human posture, and camera angle. Few studies currently address this task, and existing Re-ID methods tend to learn homogeneous structural relationships within an image, which limits their discriminability and makes them less robust to noisy images. In this paper, we propose a Local-Global Interaction Network dedicated to cross-modal problems. The network constrains the distance between the centers of the two modalities and improves intra-class cross-modality similarity. It also learns local and global features of the different modalities to enrich the extracted features. We validate the effectiveness of our approach on public benchmark datasets. Experimental results demonstrate that our method outperforms other state-of-the-art methods in terms of visual quality and quantitative measurement.
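The idea of constraining the center distance between the two modalities can be illustrated with a small sketch. It assumes that each identity has one feature center per modality (the mean of its feature vectors) and that the penalty is the squared Euclidean distance between the RGB and depth centers, averaged over identities. The abstract does not give the actual formulation, so the function name `modality_center_distance`, the distance choice, and the toy features are all hypothetical.

```python
from collections import defaultdict


def modality_center_distance(rgb_feats, depth_feats, rgb_ids, depth_ids):
    """Mean squared Euclidean distance between per-identity RGB and depth centers.

    Hypothetical helper for illustration only; not the paper's actual loss.
    """
    def centers(feats, ids):
        groups = defaultdict(list)
        for vec, pid in zip(feats, ids):
            groups[pid].append(vec)
        # Average the feature vectors of each identity, dimension by dimension.
        return {pid: [sum(col) / len(vecs) for col in zip(*vecs)]
                for pid, vecs in groups.items()}

    rgb_c = centers(rgb_feats, rgb_ids)
    depth_c = centers(depth_feats, depth_ids)
    # Only identities seen in both modalities contribute to the penalty.
    shared = sorted(set(rgb_c) & set(depth_c))
    if not shared:
        raise ValueError("no identity appears in both modalities")
    total = sum(
        sum((a - b) ** 2 for a, b in zip(rgb_c[pid], depth_c[pid]))
        for pid in shared
    )
    return total / len(shared)


# Toy 2-D features (assumed values): two identities, two samples per modality.
rgb = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
depth = [[2.0, 1.0], [2.0, 1.0], [0.0, 3.0], [0.0, 3.0]]
ids = [0, 0, 1, 1]
loss = modality_center_distance(rgb, depth, ids, ids)
print(loss)
```

In this toy case the centers of identity 1 coincide across modalities, so only identity 0 contributes to the penalty. Minimizing such a term pulls the two modality centers of each identity together, which is one way to raise intra-class cross-modality similarity.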
Acknowledgements
This work was supported by the Hefei Municipal Natural Science Foundation under Grant No. 2021050 and by the National Natural Science Foundation of China under Grant No. 62172137.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhu, C., Li, X., Qi, M., Liu, Y., Zhang, L. (2022). A Local-Global Self-attention Interaction Network for RGB-D Cross-Modal Person Re-identification. In: Yu, S., et al. Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol 13537. Springer, Cham. https://doi.org/10.1007/978-3-031-18916-6_8
DOI: https://doi.org/10.1007/978-3-031-18916-6_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-18915-9
Online ISBN: 978-3-031-18916-6
eBook Packages: Computer Science, Computer Science (R0)