An improved deep forest for alleviating the data imbalance problem

  • Methodologies and Application
  • Soft Computing

Abstract

Most deep learning methods have inherent limitations and are rarely applied to the classification of small-sized imbalanced datasets. On the one hand, data imbalance biases the classification results of the model toward the majority class. On the other hand, limited training data leads to over-fitting. Deep forest (DF) is a deep learning model that works well on small-sized datasets, and its performance is highly competitive with that of deep neural networks. In the present study, a variant of DF called the imbalanced deep forest (IMDF) is proposed to improve classification performance on the minority class, with the aim of exploring the application of deep learning to small-sized imbalanced datasets. The IMDF is a cascade of multiple layers, where each layer is an ensemble of multiple units. The main idea behind the proposed method is to enable each unit of the IMDF to handle imbalanced data, so that the classification results of the entire IMDF are biased toward the minority class. Experiments demonstrate the effectiveness of the proposed method.
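
The cascade structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how each unit handles imbalance, so the sketch assumes random undersampling of the majority class inside every unit, and it replaces the forests with a deliberately simple nearest-centroid classifier to stay dependency-free. The names `undersample`, `CentroidUnit` and `CascadeSketch` are hypothetical. As in a deep forest cascade, each layer receives the original features concatenated with the class vectors produced by the previous layer; the cross-validated generation of class vectors used in deep forest is omitted here for brevity.

```python
import random


def undersample(X, y, rng):
    """Assumed balancing step: randomly shrink every class to the smallest class size."""
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    n_min = min(len(rows) for rows in by_class.values())
    Xb, yb = [], []
    for label, rows in by_class.items():
        for xi in rng.sample(rows, n_min):
            Xb.append(xi)
            yb.append(label)
    return Xb, yb


class CentroidUnit:
    """Stand-in for one unit: nearest-centroid classifier fit on a balanced subsample."""

    def fit(self, X, y, rng):
        Xb, yb = undersample(X, y, rng)
        self.classes = sorted(set(yb))
        self.centroids = {}
        for c in self.classes:
            rows = [x for x, lab in zip(Xb, yb) if lab == c]
            self.centroids[c] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_proba(self, x):
        # Inverse squared distance to each centroid, normalised to sum to 1.
        inv = []
        for c in self.classes:
            d2 = sum((a - b) ** 2 for a, b in zip(x, self.centroids[c]))
            inv.append(1.0 / (1e-9 + d2))
        total = sum(inv)
        return [v / total for v in inv]


def class_vector(units, feats):
    """Concatenate the class-probability vectors of all units in one layer."""
    vec = []
    for unit in units:
        vec.extend(unit.predict_proba(feats))
    return vec


class CascadeSketch:
    """Cascade of layers; each layer is an ensemble of imbalance-aware units."""

    def __init__(self, n_layers=2, n_units=3, seed=0):
        self.n_layers, self.n_units, self.seed = n_layers, n_units, seed

    def fit(self, X, y):
        rng = random.Random(self.seed)
        self.classes = sorted(set(y))
        self.layers = []
        feats = [list(x) for x in X]
        for _ in range(self.n_layers):
            units = [CentroidUnit().fit(feats, y, rng) for _ in range(self.n_units)]
            self.layers.append(units)
            # Next layer input: original features + this layer's class vectors.
            feats = [list(x) + class_vector(units, f) for x, f in zip(X, feats)]
        return self

    def predict_proba(self, x):
        feats, vec = list(x), []
        for units in self.layers:
            vec = class_vector(units, feats)
            feats = list(x) + vec
        k = len(self.classes)
        # Average the last layer's class vectors over its units.
        return [sum(vec[j::k]) / self.n_units for j in range(k)]

    def predict(self, X):
        out = []
        for x in X:
            proba = self.predict_proba(x)
            out.append(self.classes[proba.index(max(proba))])
        return out


# Toy imbalanced data (synthetic, for illustration only): 40 majority vs 5 minority points.
data_rng = random.Random(1)
X_train = [[data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)] for _ in range(40)]
X_train += [[data_rng.gauss(5, 0.5), data_rng.gauss(5, 0.5)] for _ in range(5)]
y_train = [0] * 40 + [1] * 5
model = CascadeSketch(n_layers=2, n_units=3, seed=0).fit(X_train, y_train)
```

Calling `model.predict([[5.0, 5.0], [0.0, 0.0]])` then classifies one point near each cluster. Because every unit is trained on a balanced subsample, the minority class is not outvoted simply by being rare; other per-unit strategies, such as class weighting or oversampling, could be substituted in `CentroidUnit.fit` without changing the cascade.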

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61772023, 61502402) and the Fundamental Research Funds for the Central Universities (No. 20720180073).

Author information

Corresponding authors

Correspondence to Kunhong Liu or Beizhan Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Human and animal rights

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Gao, J., Liu, K., Wang, B. et al. An improved deep forest for alleviating the data imbalance problem. Soft Comput 25, 2085–2101 (2021). https://doi.org/10.1007/s00500-020-05279-8

  • DOI: https://doi.org/10.1007/s00500-020-05279-8