
Actively Balanced Bagging for Imbalanced Data

  • Conference paper
Foundations of Intelligent Systems (ISMIS 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10352)



Under-sampling extensions of bagging are currently the most accurate ensembles specialized for class-imbalanced data. However, in this type of ensemble, improved recognition of the minority class usually comes at the cost of decreased recognition of the majority classes. We therefore introduce a new two-phase ensemble called Actively Balanced Bagging. The proposal is to first learn a bagging classifier and then iteratively improve it by updating its bootstraps with a limited number of learning examples. The examples are selected according to an active learning strategy that takes into account the decision margin of votes, the class distribution of examples in the training set and/or in the example's neighbourhood, and the prediction errors of component classifiers. Experiments with synthetic and real-world data confirm the usefulness of this proposal.
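The two-phase idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes binary labels with the minority class coded as 1, one-dimensional inputs, threshold stumps as component classifiers, and a hypothetical score `(1 - margin) * class_weight`. Only the vote margin and the global class distribution are used; the neighbourhood and prediction-error criteria are left out. All function and parameter names (`fit_stump`, `vote_margin`, `select_examples`, `n_add`, and so on) are illustrative.

```python
import random

def fit_stump(sample):
    """Fit a 1-D threshold rule (predict 1 when x >= t) with minimal training error."""
    xs = sorted({x for x, _ in sample})
    best_t, best_err = xs[0], len(sample) + 1
    for t in xs:
        err = sum((x >= t) != bool(y) for x, y in sample)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def vote_margin(thresholds, x):
    """Normalised margin of votes in [0, 1]; 0 means the ensemble is evenly split."""
    pos = sum(x >= t for t in thresholds)
    return abs(2 * pos - len(thresholds)) / len(thresholds)

def predict(thresholds, x):
    """Majority vote of the stumps."""
    return int(2 * sum(x >= t for t in thresholds) > len(thresholds))

def select_examples(data, thresholds, weight, k):
    """Pick the k examples with the highest (assumed) usefulness score:
    a low vote margin and a rare class both raise the score."""
    def score(ex):
        return (1 - vote_margin(thresholds, ex[0])) * weight[ex[1]]
    return sorted(data, key=score, reverse=True)[:k]

def actively_balanced_bagging(data, n_bags=11, n_iter=5, n_add=3, seed=0):
    rng = random.Random(seed)
    n_min = sum(y for _, y in data)          # minority class is labelled 1 (assumption)
    frac_min = n_min / len(data)
    weight = {1: 1 - frac_min, 0: frac_min}  # the rarer class gets the larger weight
    # Phase 1: ordinary bagging, one bootstrap sample per component classifier.
    bags = [[rng.choice(data) for _ in data] for _ in range(n_bags)]
    # Phase 2: iteratively extend every bootstrap with a few selected examples.
    for _ in range(n_iter):
        thresholds = [fit_stump(b) for b in bags]
        chosen = select_examples(data, thresholds, weight, n_add)
        for b in bags:
            b.extend(chosen)
    return [fit_stump(b) for b in bags], bags

# Toy imbalanced data: 20 majority examples (label 0), 4 minority examples (label 1).
toy = [(float(i), 0) for i in range(20)] + [(float(i), 1) for i in range(20, 24)]
ensemble, final_bags = actively_balanced_bagging(toy)
```

In this sketch every bootstrap receives the same selected examples in each iteration. A variant closer to the prediction-error criterion named in the abstract might instead add an example only to the bootstraps whose classifier misclassifies it.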



  1. We are grateful to Prof. W. Michalowski and the MET Research Group from the University of Ottawa for allowing us to use the scrotal-pain data set.

  2. We have published detailed results for specific values of parameters, on-line, in the appendix,




The research was funded by the Polish National Science Center, grant no. DEC-2013/11/B/ST6/00963.


Corresponding author

Correspondence to Jerzy Błaszczyński.


Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Błaszczyński, J., Stefanowski, J. (2017). Actively Balanced Bagging for Imbalanced Data. In: Kryszkiewicz, M., Appice, A., Ślęzak, D., Rybinski, H., Skowron, A., Raś, Z. (eds) Foundations of Intelligent Systems. ISMIS 2017. Lecture Notes in Computer Science, vol 10352. Springer, Cham.


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-60437-4

  • Online ISBN: 978-3-319-60438-1

  • eBook Packages: Computer Science, Computer Science (R0)
