Abstract
Visible-Infrared cross-modality person Re-IDentification (VI-ReID) is a challenging task due to the large modality discrepancy and intra-modality variations. Nevertheless, it continues to attract increasing interest owing to its significant role in public security. In this paper, we propose a novel VI-ReID method based on Knowledge Self-Distillation (KSD), which aims to improve the discriminative ability of a conventional neural network through better feature exploration. KSD first constructs shallow recognizers with the same structure as the deepest recognizer within the same convolutional neural network, and then uses the deepest one to teach the shallower ones under multi-dimensional supervision. In turn, the lower-level features extracted from the shallower layers that have absorbed deep knowledge further boost higher-level feature learning. During training, multi-dimensional loss functions are integrated as the mentor for more effective learning supervision. Finally, a VI-ReID model with stronger feature representation capability is produced via abundant knowledge transfer and feedback. Extensive experiments on two public databases demonstrate the significant superiority of the proposed method in terms of identification accuracy. Furthermore, our method also proves effective for building lightweight models without sacrificing performance, which indicates its substantial application potential on resource-limited edge devices.
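The core mechanism the abstract describes — a deepest recognizer teaching shallower recognizers attached to the same network — can be illustrated by the per-recognizer training loss used in typical self-distillation setups. The sketch below is a minimal NumPy illustration, not the paper's exact formulation: the temperature `t`, the mixing weight `alpha`, and the combination of hard-label cross-entropy with a KL term toward the deepest recognizer's softened outputs are standard knowledge-distillation assumptions, not details given in the abstract.

```python
import numpy as np

def softmax(z, t=1.0):
    # Temperature-scaled softmax; a higher t gives softer target distributions.
    z = z / t
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def self_distill_loss(shallow_logits, deep_logits, labels, t=4.0, alpha=0.5):
    """Illustrative loss for one shallow recognizer: hard-label cross-entropy
    plus KL divergence toward the deepest recognizer's softened predictions.
    In practice the teacher logits would be detached from the gradient graph."""
    n = shallow_logits.shape[0]
    # Cross-entropy against the ground-truth identity labels.
    p_student = softmax(shallow_logits)
    ce = -np.log(p_student[np.arange(n), labels] + 1e-12).mean()
    # KL(teacher || student) on temperature-softened distributions.
    p_s = softmax(shallow_logits, t)
    p_t = softmax(deep_logits, t)
    kl = (p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12))).sum(axis=-1).mean()
    # t**2 rescales the softened-gradient magnitude, as is conventional.
    return (1 - alpha) * ce + alpha * (t ** 2) * kl
```

When the shallow recognizer's logits match the deepest recognizer's, the KL term vanishes, so the shallow branch is only pulled away from the labels insofar as it disagrees with the deep teacher.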
Acknowledgements
This work was supported in part by the Natural Science Foundation of Jiangsu Province under Grant BK20200649, in part by the National Natural Science Foundation of China under Grant 62001475 and Grant 62071472, and in part by the Program for “Industrial IoT and Emergency Collaboration” Innovative Research Team in China University of Mining and Technology (CUMT) under Grant 2020ZY002.
Cite this article
Zhou, Y., Li, R., Sun, Y. et al. Knowledge self-distillation for visible-infrared cross-modality person re-identification. Appl Intell 52, 10617–10631 (2022). https://doi.org/10.1007/s10489-021-02814-4