Abstract
Visible-infrared person re-identification (VI-ReID) aims to retrieve images of the same person across the visible and infrared modalities. However, the significant modality discrepancy and intra-modality variations make this task extremely challenging, and existing VI-ReID methods largely overlook lightweight network design. To address these problems, we design a lightweight two-stream network based on the omni-scale network (OSNet), and we further explore how many parameters should be shared between the two streams for the network to be most efficient. On this basis, we propose a novel self-distillation module (SDM) to improve the feature extraction capability of the two-stream network. The SDM treats the deepest classifier as the teacher model and constructs three shallow classifiers as student models. Under the teacher's guidance, the student models absorb deep knowledge from the deepest classifier, which optimizes the low-level features and in turn improves the high-level feature representation. Subsequently, to extract highly discriminative part-informed features, we introduce a multi-granularity information mining (MGIM) block that not only learns local features but also considers the internal relationships between them, which helps to fully mine local detail within the images. Extensive experiments on the SYSU-MM01, RegDB, and LLCM datasets show that our proposed method achieves superior performance.
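The teacher-student arrangement described for the SDM can be illustrated with a minimal sketch of a self-distillation loss. This is not the paper's exact objective, which the abstract does not specify. It assumes a common formulation in which each shallow classifier is trained with a cross-entropy term on the identity label plus a temperature-softened KL term toward the deepest classifier's output. The function names, the temperature of 3.0, and the weight `alpha` are all illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of one sample against its integer identity label."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def self_distillation_loss(student_logits, teacher_logits, label,
                           temperature=3.0, alpha=0.5):
    """Sum the teacher's own loss and the losses of every shallow student.

    student_logits: list of logit vectors, one per shallow classifier
                    (three in the setting described above).
    teacher_logits: logit vector of the deepest classifier.
    temperature and alpha are assumed hyperparameters, not values
    taken from the paper.
    """
    # The teacher's softened output is treated as a fixed target.
    teacher_soft = softmax(teacher_logits, temperature)
    loss = cross_entropy(teacher_logits, label)
    for logits in student_logits:
        hard = cross_entropy(logits, label)
        soft = kl_divergence(teacher_soft, softmax(logits, temperature))
        loss += (1.0 - alpha) * hard + alpha * temperature ** 2 * soft
    return loss


# Toy example: three identities, three shallow classifiers.
teacher = [4.0, 0.5, -1.0]
students = [[1.0, 0.8, 0.2], [2.0, 0.5, 0.0], [3.0, 0.5, -0.5]]
loss = self_distillation_loss(students, teacher, label=0)
```

In a real training loop the teacher's softened distribution would be detached from the computation graph so that gradients from the KL term flow only into the shallow branches. The pure-Python version above only shows how the terms combine.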

Data availability
All datasets used in this research are accessible. The RegDB dataset can be downloaded from http://dm.dongguk.edu/link.html, the SYSU-MM01 dataset is available at https://github.com/wuancong/SYSU-MM01?tab=readme-ov-file, and the LLCM dataset is available at https://github.com/ZYK100/LLCM. In all three cases, a signed dataset release agreement must be sent to the designated contact to obtain the download links or access permissions.
The software and other materials required for this paper are freely available online.
Code availability
The code and data in this paper are currently not publicly shared. Upon reasonable request, we will provide the source code to readers as needed. Please contact carole_zhang@vip.163.com and ovolition@163.com for inquiries.
Acknowledgements
We wish to thank the data providers for SYSU-MM01, RegDB, and LLCM. We also wish to thank all reviewers and editors who provided valuable suggestions for our paper.
Funding
No funding was received to assist with the preparation of this manuscript.
Author information
Contributions
All authors contributed to the study's conception and design. Hongying Zhang and Jiangbing Zeng handled material preparation, data collection, analysis, and software validation. Jiangbing Zeng drafted the initial manuscript, while Hongying Zhang reviewed and edited it. All authors reviewed previous versions and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Consent to participate
Written informed consent for publication of this paper was obtained from the Civil Aviation University of China and all authors.
Consent for publication
All authors consent to the publication of this paper and confirm the originality and licensing of all content.
About this article
Cite this article
Zhang, H., Zeng, J. Lightweight network for visible-infrared person re-identification via self-distillation and multi-granularity information mining. J Supercomput 81, 56 (2025). https://doi.org/10.1007/s11227-024-06543-6
DOI: https://doi.org/10.1007/s11227-024-06543-6