Abstract
Cross-modality visible-infrared person re-identification (VI-ReID) is a challenging pedestrian retrieval problem whose two main difficulties are intra-class variation and the cross-modality discrepancy between visible and infrared images. To address these issues, many state-of-the-art methods attempt to learn coarse image alignment or part-level person features. However, these methods are often limited by intra-identity variation, and the resulting image alignment is not always reliable. To overcome these two shortcomings, this paper constructs a relational alignment and distance optimization network (RADONet). First, we design cross-modal relational alignment (CM-RA), which exploits the correspondence between cross-modal images to handle cross-modal differences at the pixel level. Second, we propose a cross-modal Wasserstein distance (CM-WD) to mitigate the effects of intra-identity variation in modal alignment. In this way, the network overcomes the effects of identity variation by focusing on reducing inter-modal differences and performing more effective feature alignment. Extensive experiments show that our method outperforms state-of-the-art methods on two challenging datasets, with improvements of 3.39% and 2.06% in Rank-1 and mAP, respectively, on the SYSU-MM01 dataset.
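To make the idea of a Wasserstein-style distance between modalities concrete, the following is a minimal sketch of an entropy-regularized optimal transport cost (Sinkhorn iterations) between a set of visible features and a set of infrared features. It is not the authors' CM-WD: the function names `sinkhorn_plan` and `transport_cost`, the uniform weights, the squared Euclidean ground cost, and the values of `eps` and `n_iters` are all assumptions chosen for illustration.

```python
import numpy as np


def sinkhorn_plan(cost, eps=0.1, n_iters=200):
    """Approximate optimal transport plan between two uniform distributions.

    `cost` is an (n, m) matrix of pairwise ground costs. Hypothetical helper,
    not taken from the paper.
    """
    n, m = cost.shape
    a = np.full(n, 1.0 / n)  # uniform mass on the visible samples (assumption)
    b = np.full(m, 1.0 / m)  # uniform mass on the infrared samples (assumption)
    # Rescale the cost to [0, 1] so that exp(-cost / eps) does not underflow.
    scaled = cost / (cost.max() + 1e-12)
    kernel = np.exp(-scaled / eps)
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = a / (kernel @ v)
        v = b / (kernel.T @ u)
    return u[:, None] * kernel * v[None, :]


def transport_cost(vis_feats, ir_feats, eps=0.1, n_iters=200):
    """Transport cost between two feature sets of shape (n, d) and (m, d)."""
    diff = vis_feats[:, None, :] - ir_feats[None, :, :]
    cost = (diff ** 2).sum(axis=-1)  # squared Euclidean ground cost
    plan = sinkhorn_plan(cost, eps, n_iters)
    return float((plan * cost).sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vis = rng.normal(size=(8, 16))                      # toy visible features
    ir_near = vis + 0.01 * rng.normal(size=vis.shape)   # well-aligned infrared
    ir_far = vis + 5.0                                  # strongly shifted infrared
    print(transport_cost(vis, ir_near), transport_cost(vis, ir_far))
```

In this toy setting the cost is small when the two feature sets nearly coincide and large when one set is shifted away from the other, which is the property a distribution-level alignment loss relies on.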
Acknowledgements
This work is partially supported by the National Natural Science Foundation of China (Grant Nos. 61906168, 62176237); Zhejiang Provincial Natural Science Foundation of China (Grant No. LY23F020023); Construction of Hubei Provincial Key Laboratory for Intelligent Visual Monitoring of Hydropower Projects (2022SDSJ01); the Hangzhou AI major scientific and technological innovation project (Grant No. 2022AIZD0061); Zhejiang Provincial Education Department Scientific Research Project (Y202249633).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Du, F., Li, Z., Mao, J., Lei, Y., Chan, S. (2023). Relational Alignment and Distance Optimization for Cross-Modality Person Re-identification. In: Yang, H., et al. Intelligent Robotics and Applications. ICIRA 2023. Lecture Notes in Computer Science, vol 14267. Springer, Singapore. https://doi.org/10.1007/978-981-99-6483-3_39
DOI: https://doi.org/10.1007/978-981-99-6483-3_39
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-6482-6
Online ISBN: 978-981-99-6483-3
eBook Packages: Computer Science, Computer Science (R0)