
MEKF: long-tailed visual recognition via multiple experts with knowledge fusion

Published in The Journal of Supercomputing

Abstract

The distribution of data in real-world applications often follows a long-tailed shape, which makes long-tailed recognition challenging because conventional models are biased toward the majority categories. Multi-expert ensemble methods have shown promise but often suffer from insufficient expert diversity and high model variance. To address these issues, we propose multiple experts with knowledge fusion (MEKF), which comprises diversified fusion experts and dual-view self-distillation. MEKF enhances expert diversity by fusing features of different depths and introduces a distribution diversity loss with distribution weights. Dual-view self-distillation reduces model variance by extracting semantic information from predictions on weakly augmented data. Experiments on the CIFAR100-LT, ImageNet-LT, and Places-LT benchmarks validate the effectiveness of MEKF, which achieves excellent performance on all three datasets.
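To give a concrete picture of the components named above, the following is a minimal, hypothetical PyTorch sketch of two of the ideas: experts that read backbone features of different depths, and a self-distillation term that matches predictions on a strongly augmented view against detached predictions on a weakly augmented view. All module names, shapes, and loss weights are illustrative assumptions rather than the authors' implementation, and the distribution diversity loss is not shown.

```python
# Hypothetical sketch of multi-depth experts with dual-view self-distillation.
# Architecture, names, and weights are illustrative assumptions only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiDepthExperts(nn.Module):
    """Toy backbone with two stages; each expert fuses a different depth."""

    def __init__(self, num_classes: int = 100, width: int = 64):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, width, 3, padding=1), nn.ReLU(),
                                    nn.AdaptiveAvgPool2d(4))
        self.stage2 = nn.Sequential(nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
                                    nn.AdaptiveAvgPool2d(1))
        # Expert 1 sees only deep features; expert 2 fuses shallow and deep ones.
        self.expert1 = nn.Linear(width, num_classes)
        self.expert2 = nn.Linear(width * 16 + width, num_classes)

    def forward(self, x):
        f1 = self.stage1(x)                 # shallow features, (B, W, 4, 4)
        f2 = self.stage2(f1)                # deep features,    (B, W, 1, 1)
        deep = f2.flatten(1)
        fused = torch.cat([f1.flatten(1), deep], dim=1)
        return self.expert1(deep), self.expert2(fused)   # per-expert logits


def self_distill_loss(strong_logits, weak_logits, T: float = 2.0):
    """KL divergence from detached weak-view predictions (teacher side)
    to strong-view predictions (student side); a stand-in for the
    dual-view self-distillation term described in the abstract."""
    teacher = F.softmax(weak_logits.detach() / T, dim=1)
    student = F.log_softmax(strong_logits / T, dim=1)
    return F.kl_div(student, teacher, reduction="batchmean") * T * T


if __name__ == "__main__":
    model = MultiDepthExperts()
    x_strong = torch.randn(8, 3, 32, 32)                     # strongly augmented view
    x_weak = x_strong + 0.01 * torch.randn_like(x_strong)    # weakly augmented view (toy)
    y = torch.randint(0, 100, (8,))

    logits_s = model(x_strong)
    logits_w = model(x_weak)

    ce = sum(F.cross_entropy(l, y) for l in logits_s)                      # per-expert CE
    sd = sum(self_distill_loss(s, w) for s, w in zip(logits_s, logits_w))  # self-distillation
    loss = ce + 0.5 * sd
    loss.backward()
    print(float(loss))
```

In this toy setup each expert is trained with its own cross-entropy term, while the self-distillation term pulls strong-view predictions toward the smoother weak-view predictions; the 0.5 weight is an arbitrary placeholder.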


Data availability

The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.

Code availability

The code that supports the findings of this study is available from the corresponding author upon reasonable request.


Acknowledgements

The work described in this paper was supported by the National Natural Science Foundation of China (No. 61673396) and the Natural Science Foundation of Shandong Province (No. ZR2022MF260).

Funding

The work was supported by the National Natural Science Foundation of China (No. 61673396) and the Natural Science Foundation of Shandong Province (No. ZR2022MF260).

Author information


Contributions

H.L. contributed to data curation; M.S. was responsible for funding acquisition; C.H. prepared the original draft; Q.Z. revised and edited the manuscript. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Chenghao Ji.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zhang, Q., Ji, C., Shao, M. et al. MEKF: long-tailed visual recognition via multiple experts with knowledge fusion. J Supercomput 81, 407 (2025). https://doi.org/10.1007/s11227-025-06920-9
