Abstract
Data in real-world applications often follow a long-tailed distribution, which makes long-tailed recognition challenging because conventional models are biased toward the majority categories. Multi-expert ensemble methods have shown promise but often suffer from insufficient expert diversity and high model variance. To address these issues, we propose multiple experts with knowledge fusion (MEKF), which comprises diversified fusion experts and dual-view self-distillation. MEKF enhances expert diversity by fusing features from different network depths and introduces a distribution diversity loss with distribution weights. Dual-view self-distillation reduces model variance by extracting semantic information from predictions on weakly augmented data. Experiments on the CIFAR100-LT, ImageNet-LT, and Places-LT benchmarks validate the effectiveness of MEKF.
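To make the two components concrete, the PyTorch-style sketch below illustrates the general ideas under our own assumptions; the module names, the additive fusion rule, and the temperature are hypothetical and are not taken from the paper. A FusionExpert head fuses features from two backbone depths before classifying, and dual_view_self_distillation distills the detached prediction on a weakly augmented view into the prediction on a strongly augmented view.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class FusionExpert(nn.Module):
        # One expert head that fuses a shallow and a deep backbone feature
        # before classification (dimensions and fusion rule are illustrative).
        def __init__(self, shallow_dim, deep_dim, num_classes):
            super().__init__()
            self.proj = nn.Linear(shallow_dim, deep_dim)       # align shallow features
            self.classifier = nn.Linear(deep_dim, num_classes)

        def forward(self, shallow_feat, deep_feat):
            fused = deep_feat + self.proj(shallow_feat)        # simple additive fusion
            return self.classifier(fused)

    def dual_view_self_distillation(logits_strong, logits_weak, temperature=2.0):
        # KL divergence from the detached weak-view prediction to the strong-view
        # prediction; the weak view acts as a teacher to damp prediction variance.
        teacher = F.softmax(logits_weak.detach() / temperature, dim=1)
        student = F.log_softmax(logits_strong / temperature, dim=1)
        return F.kl_div(student, teacher, reduction="batchmean") * temperature ** 2

In training, such a distillation term would be added to each expert's classification objective with some weight; the distribution diversity loss mentioned above is not sketched here.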




Data availability
The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.
Code availability
The code that supports the findings of this study is available from the corresponding author upon reasonable request.
Acknowledgements
The work described in this paper was supported by the National Natural Science Foundation of China (No. 61673396) and the Natural Science Foundation of Shandong Province (No. ZR2022MF260).
Funding
The work was supported by the National Natural Science Foundation of China (No. 61673396) and the Natural Science Foundation of Shandong Province (No. ZR2022MF260).
Author information
Authors and Affiliations
Contributions
H.L. contributed to data curation; M.S. contributed to funding acquisition; C.H. wrote the original draft; Q.Z. reviewed and edited the manuscript. All authors have read and agreed to the published version of the manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, Q., Ji, C., Shao, M. et al. MEKF: long-tailed visual recognition via multiple experts with knowledge fusion. J Supercomput 81, 407 (2025). https://doi.org/10.1007/s11227-025-06920-9