
Knowledge self-distillation for visible-infrared cross-modality person re-identification

Published in: Applied Intelligence

Abstract

Visible-Infrared cross-modality person Re-IDentification (VI-ReID) is a challenging task due to the large modality discrepancy and intra-modality variations. Nevertheless, it has attracted increasing interest owing to its significant role in public security. In this paper, we propose a novel VI-ReID method based on Knowledge Self-Distillation (KSD), which aims to improve the discrimination ability of a common neural network through better feature exploration. KSD is achieved by first constructing shallow recognizers with the same structure as the deepest recognizer within the same convolutional neural network, and then using the deepest recognizer to teach the shallower ones under multi-dimensional supervision. In turn, the lower-level features extracted from the shallower layers, having absorbed deep knowledge, further boost higher-level feature learning. During training, multi-dimensional loss functions are integrated as the mentor for more effective learning supervision. The result is a VI-ReID model with better feature representation capability, produced through abundant knowledge transfer and feedback. Extensive experiments on two public databases demonstrate the significant superiority of the proposed method in terms of identification accuracy. Furthermore, our method also proves effective for model lightweighting while preserving performance, indicating great application potential on resource-limited edge devices.
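The self-distillation objective sketched in the abstract, where the deepest recognizer teaches shallower branches sharing its structure, can be illustrated in code. The following is a minimal sketch, not the authors' exact multi-dimensional loss: each shallow branch is supervised by both the ground-truth label and a temperature-softened KL-divergence term against the deepest branch's predictions. The temperature, the weighting `alpha`, and the logit shapes are all assumptions for illustration.

```python
import math

def softmax(logits, temp=1.0):
    """Temperature-scaled softmax over a list of logits."""
    exps = [math.exp(z / temp) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Negative log-likelihood of the ground-truth class."""
    return -math.log(probs[label])

def kl_div(p, q):
    """KL divergence KL(p || q) between two distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def self_distillation_loss(shallow_logits_list, deep_logits, label,
                           temp=4.0, alpha=0.5):
    # Hard-label loss for the deepest recognizer.
    loss = cross_entropy(softmax(deep_logits), label)
    # Soften the deepest output so it can serve as the teacher.
    teacher = softmax(deep_logits, temp)
    for logits in shallow_logits_list:
        # Each shallow branch learns from the ground truth ...
        loss += (1 - alpha) * cross_entropy(softmax(logits), label)
        # ... and from the deepest branch's softened predictions
        # (temp**2 rescales gradients, as is standard in distillation).
        loss += alpha * (temp ** 2) * kl_div(teacher, softmax(logits, temp))
    return loss
```

A shallow branch whose predictions agree with the deepest branch (and the label) incurs a lower loss than one that disagrees, which is what drives the knowledge transfer.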



Acknowledgements

This work was supported in part by the Natural Science Foundation of Jiangsu Province under Grant BK20200649, in part by the National Natural Science Foundation of China under Grant 62001475 and Grant 62071472, and in part by the Program for “Industrial IoT and Emergency Collaboration” Innovative Research Team in China University of Mining and Technology (CUMT) under Grant 2020ZY002.

Author information

Correspondence to Rui Li.



About this article


Cite this article

Zhou, Y., Li, R., Sun, Y. et al. Knowledge self-distillation for visible-infrared cross-modality person re-identification. Appl Intell 52, 10617–10631 (2022). https://doi.org/10.1007/s10489-021-02814-4
