GREB: gradient re-balanced loss for long-tailed multi-label classification

  • Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

Image classification has advanced remarkably on class-balanced benchmarks. However, the natural distribution of data in real-world scenarios is long-tailed, and long-tailed classification has become a significant challenge in critical real-world image classification applications. A deep learning network trained on a long-tailed dataset tends to misclassify tail classes, which have few samples, as head classes, which have many samples. The severe sample imbalance means that negative samples overwhelmingly dominate the tail classes; the large accumulated gradients from these negative samples then cause the classifier to perform poorly. To tackle this problem, we propose a gradient re-balanced (GREB) loss with two synergistic factors: a balance factor and a correction factor. First, GREB estimates the balance and correction factors by accumulating the classifier outputs and their corresponding labels during training. Then, GREB dynamically reweights the gradients of positive and negative samples based on the balance factor to minimize classification bias and improve classifier performance. Finally, GREB compensates sample gradients based on the correction factor to reduce misclassifications and improve precision. Experimental results show that our GREB loss achieves state-of-the-art performance on long-tailed multi-label classification datasets (MSCOCO and MultiMNIST) and long-tailed single-label classification datasets (CIFAR10-LT and CIFAR100-LT).
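The abstract does not give the formulas for the two factors, so the following is only a minimal sketch of the general idea of gradient re-balancing, not the published GREB loss. It assumes a sigmoid cross-entropy per class and a hypothetical balance factor defined as the accumulated positive-to-negative gradient ratio clipped to [0, 1]. The names `RebalancedBCE` and `balance_factor` are illustrative. The correction factor is not modelled because the abstract does not define it.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class RebalancedBCE:
    """Hypothetical re-weighted sigmoid cross-entropy; NOT the published GREB formula."""

    def __init__(self, num_classes, eps=1e-8):
        self.eps = eps
        # Running sums of |p - y| for positive and negative samples, per class.
        self.pos_grad = np.zeros(num_classes)
        self.neg_grad = np.zeros(num_classes)

    def balance_factor(self):
        # Assumed definition: accumulated positive-to-negative gradient ratio,
        # clipped to [0, 1]; classes with no statistics yet keep weight 1.
        seen = (self.pos_grad + self.neg_grad) > 0
        ratio = self.pos_grad / (self.neg_grad + self.eps)
        return np.where(seen, np.clip(ratio, 0.0, 1.0), 1.0)

    def __call__(self, logits, labels):
        p = sigmoid(logits)
        # Positives keep weight 1; negatives of each class are scaled by its factor.
        weights = np.where(labels == 1, 1.0, self.balance_factor())
        bce = -(labels * np.log(p + self.eps)
                + (1 - labels) * np.log(1 - p + self.eps))
        loss = (weights * bce).sum(axis=1).mean()
        # Gradient w.r.t. the logits, treating the weights as constants
        # and ignoring the small eps inside the logarithms.
        grad = weights * (p - labels) / logits.shape[0]
        # Accumulate unweighted gradient magnitudes for later calls.
        magnitude = np.abs(p - labels)
        self.pos_grad += (magnitude * labels).sum(axis=0)
        self.neg_grad += (magnitude * (1 - labels)).sum(axis=0)
        return loss, grad


# Toy batch: class 0 is a head class (3 positives), class 1 a tail class (1 positive).
logits = np.zeros((4, 2))
labels = np.array([[1, 0], [1, 0], [1, 0], [0, 1]], dtype=float)
loss_fn = RebalancedBCE(num_classes=2)
first_loss, _ = loss_fn(logits, labels)      # no statistics yet: plain BCE
second_loss, grad = loss_fn(logits, labels)  # tail-class negatives down-weighted
print(loss_fn.balance_factor())
print(first_loss, second_loss)
```

In this toy batch the tail class receives a factor below 1, so its negative samples contribute less gradient on the second call while the head class is left unchanged. A real implementation would likely accumulate the statistics over many mini-batches and might also re-weight positive samples, which this sketch leaves out.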

Data availability

The datasets generated and analysed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This work was supported in part by the Natural Science Foundation of China under Grant 62076255; in part by the Open Research Projects of Zhejiang Lab (No. 2022RC0AB07); in part by the Hunan Provincial Science and Technology Plan Project 2020SK2059; in part by the Key Projects of Hunan Education Department 20A88; in part by the National Science Foundation of Hunan Province 2021JJ30082; in part by the Yongzhou City Instructive Science and Technology Plan Project 2021YZKJZD003; and in part by the Scientific Research Projects of Hunan University of Science and Engineering 20XKY054.

Author information

Corresponding author

Correspondence to Kehua Guo.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest in this work. The experiments use the public datasets MSCOCO, MNIST, CIFAR10, and CIFAR100.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wu, Z., Guo, K., Ren, S. et al. GREB: gradient re-balanced loss for long-tailed multi-lable classification. J Ambient Intell Human Comput 14, 7937–7948 (2023). https://doi.org/10.1007/s12652-023-04602-z

  • DOI: https://doi.org/10.1007/s12652-023-04602-z
