Abstract
Data in the natural open world tends to follow a long-tailed class distribution, so deep models trained on such data frequently perform poorly on tail classes. Although existing approaches improve performance on tail categories through strategies such as class rebalancing, they often sacrifice the deep features that the model has already learned. In this paper, we propose a new joint distillation framework called JWAFD (Joint weighted knowledge distillation and multi-scale feature distillation) to address the long-tailed recognition problem from the perspective of knowledge distillation. The framework comprises two modules. First, the weighted knowledge distillation module uses a category prior to adjust the weight of each category, which makes the training process more balanced across all categories. Second, the multi-scale feature distillation module further optimizes the feature representation, addressing the under-learning of features encountered in previous studies. Compared with previous studies, the proposed framework significantly improves performance on rare classes while maintaining recognition performance on head classes. Extensive experiments on three benchmark datasets (CIFAR-100-LT, ImageNet-LT and iNaturalist2018) demonstrate that the proposed distillation framework achieves performance comparable to state-of-the-art long-tailed recognition methods. Our code is available at: https://github.com/xiaohe6/JWAFD.
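To make the two modules concrete, the following is a minimal sketch under stated assumptions, not the formulation used in JWAFD. The abstract does not give the exact form of the category prior or of the feature loss, so the sketch assumes inverse-frequency class weights rescaled to mean 1, a class-weighted KL divergence on temperature-softened outputs, and a mean squared error averaged over feature scales. All function names and parameters (`class_prior_weights`, `weighted_kd_loss`, `multi_scale_feature_loss`, `temperature`) are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over one list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def class_prior_weights(class_counts):
    """ASSUMPTION: weights are inverse class frequencies, rescaled to mean 1."""
    inv = [1.0 / c for c in class_counts]
    scale = len(inv) / sum(inv)
    return [v * scale for v in inv]


def weighted_kd_loss(student_logits, teacher_logits, weights, temperature=2.0):
    """ASSUMPTION: class-weighted KL(teacher || student) for a single sample."""
    eps = 1e-12  # guards against log(0) for extreme logits
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    terms = [
        w * t * (math.log(t + eps) - math.log(s + eps))
        for w, t, s in zip(weights, p_t, p_s)
    ]
    # The T^2 factor is the usual gradient rescaling for softened targets.
    return temperature ** 2 * sum(terms)


def multi_scale_feature_loss(student_feats, teacher_feats):
    """ASSUMPTION: MSE per scale (each scale a flat list), averaged over scales."""
    per_scale = []
    for s, t in zip(student_feats, teacher_feats):
        per_scale.append(sum((a - b) ** 2 for a, b in zip(s, t)) / len(s))
    return sum(per_scale) / len(per_scale)


# Usage with made-up numbers: three classes with 500, 50 and 5 training samples.
weights = class_prior_weights([500, 50, 5])
kd = weighted_kd_loss([2.0, 1.0, 0.1], [1.5, 1.2, 0.8], weights)
fd = multi_scale_feature_loss([[0.1, 0.2], [0.3]], [[0.0, 0.2], [0.5]])
total = kd + fd  # equal weighting of the two terms is also an assumption
```

With these assumed weights, disagreement between teacher and student on a rare class contributes more to the loss than the same disagreement on a frequent class, which is one way a category prior can rebalance distillation.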










Availability of data and materials
The data used to support the findings of this study are available from the corresponding author upon request.
Acknowledgements
This work was supported by the Key Science and Technology Project of Henan Province (Grant No. 201300210400) and the Henan Province Science and Technology Research Project (Grant No. 232102210031).
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by JY, SW, CL, XH, HL and YH. The first draft of the manuscript was written by YH, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Consent to participate
Informed consent was obtained from all individual participants included in the study.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
He, Y., Wang, S., Yu, J. et al. Joint weighted knowledge distillation and multi-scale feature distillation for long-tailed recognition. Int. J. Mach. Learn. & Cyber. 15, 1647–1661 (2024). https://doi.org/10.1007/s13042-023-01988-2
DOI: https://doi.org/10.1007/s13042-023-01988-2