
Output Layer Multiplication for Class Imbalance Problem in Convolutional Neural Networks

Published in Neural Processing Letters

Abstract

Convolutional neural networks (CNNs) have demonstrated remarkable performance in the field of computer vision. However, they are prone to the class imbalance problem, in which the number of examples in some classes is significantly higher or lower than in others. There are two main strategies for handling the problem: dataset-level methods, which resample the data, and algorithm-level methods, which modify the existing learning framework. Most of these methods, however, require extra data resampling or elaborate algorithm design. In this work we present an effective yet extremely simple approach to tackling the imbalance problem in CNNs trained with the cross-entropy loss. Specifically, we multiply the output of the last layer of a CNN by a coefficient \( \alpha > 1 \). With this modification, the final loss function dynamically adjusts the contributions of examples from different classes during imbalanced training. Because of its simplicity, the proposed method can be applied to off-the-shelf models with little change. To demonstrate its effectiveness on the imbalance problem, we design three classification experiments of increasing complexity. The experimental results show that our approach can improve the convergence rate during training and/or increase test accuracy.
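The modification can be summarized in a few lines. Below is a minimal sketch (our illustration, not the authors' released code), assuming a generic PyTorch classifier `model` and a hypothetical constant `ALPHA`; it simply scales the logits by \( \alpha \) before the standard softmax cross-entropy, i.e. it computes \( -\log ( e^{\alpha o_{k}} / \sum_{j} e^{\alpha o_{j}} ) \) for the true class \( k \):

```python
import torch
import torch.nn as nn

# Minimal sketch of the proposed modification (our illustration,
# not the authors' code): multiply the last layer's output by a
# constant alpha > 1 before the usual softmax cross-entropy loss.
ALPHA = 2.0  # hypothetical value; the paper only requires alpha > 1

criterion = nn.CrossEntropyLoss()

def imbalance_aware_loss(model: nn.Module,
                         images: torch.Tensor,
                         targets: torch.Tensor) -> torch.Tensor:
    logits = model(images)                     # raw last-layer outputs, shape (N, C)
    return criterion(ALPHA * logits, targets)  # cross-entropy on the scaled logits
```

Because the change is a single scalar multiplication of the logits, it adds no parameters and leaves the rest of the training pipeline untouched, which is what makes it easy to drop into off-the-shelf models.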

Acknowledgements

This research was supported by NSFC (No. 61501177, 61772455, U1713213, 41601394, 61902084), Guangzhou University’s training program for excellent new-recruited doctors (No. YB201712), Major Science and Technology Project of Precious Metal Materials Genetic Engineering in Yunnan Province (No. 2019ZE001-1, 202002AB080001), Yunnan Natural Science Funds (No. 2018FY001(-013), 2019FA-045), Yunnan University Natural Science Funds (No. 2018YDJQ004), the Project of Innovative Research Team of Yunnan Province (No. 2018HC019), Guangdong Natural Science Foundation (No. 2017A030310639), and Featured Innovation Project of Guangdong Education Department (No. 2018KTSCX174).

Author information

Corresponding author

Correspondence to Dapeng Tao.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Proof that Eq. (10) is negative for \( \alpha > 1 \).

Consider the function

$$ f(x) = \frac{\log(x)}{1 - x}, \quad x \in (0, 1). $$
(15)

Differentiating with respect to \( x \), we get

$$ \frac{\partial f(x)}{\partial x} = \frac{\frac{1}{x} - 1 + \log(x)}{(1 - x)^{2}}. $$
(16)

Let \( g(x) \) denote the numerator of Eq. (16):

$$ g(x) = \frac{1}{x} - 1 + \log(x), \quad x \in (0, 1). $$
(17)

Differentiating again with respect to \( x \), we have

$$ \frac{\partial g(x)}{\partial x} = \frac{1}{x}\left(1 - \frac{1}{x}\right). $$
(18)

Since Eq. (18) is always negative for \( x \in (0, 1) \), \( g(x) \) is a decreasing function on this interval. Its infimum is zero, approached as \( x \to 1 \), since \( g(1) = 0 \); thus \( g(x) \) is always positive for \( x \in (0, 1) \). It follows that \( f(x) \) is an increasing function on \( (0, 1) \).

For brevity, let \( p_{1} = e^{o_{k}} / \sum_{j=1}^{C} e^{o_{j}} \) and \( p_{\alpha} = e^{\alpha o_{k}} / \sum_{j=1}^{C} e^{\alpha o_{j}} \). Then \( p_{1}, p_{\alpha} \in (0, 1) \) and \( p_{1} < p_{\alpha} \) for \( \alpha > 1 \), as can be inferred from the proof of Theorem 2. By the monotonicity of \( f(x) \), we have

$$ \frac{\log(p_{1})}{1 - p_{1}} < \frac{\log(p_{\alpha})}{1 - p_{\alpha}}, $$
(19)

which, since \( 1 - p_{1} \) and \( 1 - p_{\alpha} \) are both positive, can be rearranged as

$$ (1 - p_{\alpha})\log(p_{1}) < (1 - p_{1})\log(p_{\alpha}). $$
(20)

Because both sides of Eq. (20) are negative, multiplying the smaller left-hand side by \( \alpha > 1 \) makes it smaller still, so we obtain

$$ \alpha(1 - p_{\alpha})\log(p_{1}) < (1 - p_{1})\log(p_{\alpha}), \quad \alpha > 1. $$
(21)

We can therefore conclude that Eq. (10) is negative for \( \alpha > 1 \).
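As a quick numerical sanity check of this chain of inequalities (our own sketch, not part of the paper), the snippet below draws random logits, takes \( o_{k} \) to be the largest logit so that the precondition \( p_{1} < p_{\alpha} \) holds, and verifies Eqs. (19)–(21):

```python
import numpy as np

# Sanity check (our illustration, not from the paper): verify
#   log(p1)/(1-p1) < log(pa)/(1-pa)                       (Eq. 19)
#   (1-pa)*log(p1) < (1-p1)*log(pa)                       (Eq. 20)
#   alpha*(1-pa)*log(p1) < (1-p1)*log(pa)                 (Eq. 21)
# on random logits, with o_k chosen as the largest logit.
rng = np.random.default_rng(0)

def softmax(o):
    e = np.exp(o - o.max())  # subtract the max for numerical stability
    return e / e.sum()

for _ in range(10_000):
    logits = rng.normal(size=10)       # C = 10 hypothetical classes
    alpha = 1.0 + 4.0 * rng.random()   # alpha drawn from (1, 5)
    k = int(np.argmax(logits))         # p1 < pa needs o_k to be the max logit
    p1 = softmax(logits)[k]
    pa = softmax(alpha * logits)[k]
    assert p1 < pa
    assert np.log(p1) / (1 - p1) < np.log(pa) / (1 - pa)          # Eq. (19)
    assert (1 - pa) * np.log(p1) < (1 - p1) * np.log(pa)          # Eq. (20)
    assert alpha * (1 - pa) * np.log(p1) < (1 - p1) * np.log(pa)  # Eq. (21)

print("all 10,000 random checks passed")
```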


About this article

Cite this article

Yang, Z., Zhu, Y., Liu, T. et al. Output Layer Multiplication for Class Imbalance Problem in Convolutional Neural Networks. Neural Process Lett 52, 2637–2653 (2020). https://doi.org/10.1007/s11063-020-10366-w
