Abstract
In the cross-modality visible-infrared person re-identification (VI-ReID) task, matching visible and infrared images is difficult because of the large feature gap between the two modalities. Existing methods often impose constraints on raw pixels or on extracted features to obtain discriminative representations, but such constraints tend to introduce irrelevant background clutter and are weak at extracting cross-modality invariant features. This paper proposes an end-to-end neural network called the multi-scale attention part aggregation network (MSAPANet). The framework consists of an intra-modality multi-scale attention (IMSA) module and a fine-grained part aggregation learning (FPAL) module. The IMSA module mines attention-enhanced discriminative part features within each modality and suppresses background feature extraction. FPAL fuses fine-grained local features with global semantic information through channel-spatial joint soft attention (CSA) to extract cross-modality shared features efficiently. Experiments on SYSU-MM01 and RegDB, two common VI-ReID datasets, show that our method outperforms current state-of-the-art methods under various settings. Through this network design, the model mines intra-modality and inter-modality salient information simultaneously, improving the discriminability of fine-grained features in the channel and spatial dimensions and promoting modality-invariant, discriminative feature representation learning for VI-ReID.
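The channel-spatial joint soft attention described above can be illustrated with a minimal NumPy sketch. This is not the authors' exact CSA module (the paper's learned convolution/MLP layers and multi-scale branches are omitted); it only shows the general pattern of gating a feature map first along the channel dimension and then along the spatial dimensions, as in CBAM-style attention. The function name and shapes are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_spatial_attention(feat):
    """Gate a (C, H, W) feature map along channels, then along spatial positions.

    A simplified, parameter-free stand-in for channel-spatial joint soft
    attention; real modules learn the gating weights.
    """
    # Channel attention: squeeze spatial dims to one descriptor per channel,
    # turn it into a soft gate, and reweight each channel.
    ch_desc = feat.mean(axis=(1, 2))            # shape (C,)
    ch_gate = sigmoid(ch_desc)                  # soft weights in (0, 1)
    feat = feat * ch_gate[:, None, None]

    # Spatial attention: squeeze the channel dim to one descriptor per
    # location, then reweight each spatial position.
    sp_desc = feat.mean(axis=0)                 # shape (H, W)
    sp_gate = sigmoid(sp_desc)
    return feat * sp_gate[None, :, :]

x = np.random.randn(64, 8, 4).astype(np.float32)
y = channel_spatial_attention(x)
print(y.shape)  # (64, 8, 4)
```

The sequential channel-then-spatial ordering means salient channels are emphasized before locations are scored, so background positions in already-suppressed channels contribute little to the spatial gate.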
Acknowledgements
This work was supported by the National Natural Science Foundation of China (61972059, 61773272, 62102347), the China Postdoctoral Science Foundation (2021M69236), the Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University (93K172017K18), the Natural Science Foundation of Jiangsu Province (BK20191474, BK20191475, BK20161268), and the Qinglan Project of Jiangsu Province (No. 2020).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Fan, L., Gong, S., Zhong, S. (2023). Cross-Modality Visible-Infrared Person Re-Identification with Multi-scale Attention and Part Aggregation. In: Tanveer, M., Agarwal, S., Ozawa, S., Ekbal, A., Jatowt, A. (eds) Neural Information Processing. ICONIP 2022. Communications in Computer and Information Science, vol 1793. Springer, Singapore. https://doi.org/10.1007/978-981-99-1645-0_20
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-1644-3
Online ISBN: 978-981-99-1645-0
eBook Packages: Computer Science (R0)