An Ensemble Pruning Approach Based on Reinforcement Learning in Presence of Multi-class Imbalanced Data

  • Conference paper
Proceedings of the Third International Conference on Soft Computing for Problem Solving

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 258)

Abstract

In recent years, learning from imbalanced data sets has become a challenging issue in the machine learning and data mining communities. The problem arises when some classes have fewer instances than others. Multi-class imbalanced data sets are common in real-world applications, and many standard machine learning algorithms have difficulty dealing with them. In this paper, we propose an ensemble pruning approach based on the Reinforcement Learning framework. Inspired by the Markov Decision Process, we treat ensemble pruning as a one-player game and select the best classifiers from our initial state space. The selected classifiers, which together can produce a good ensemble model, are then employed to learn from multi-class imbalanced data sets. Our experimental results on several UCI and KEEL benchmark data sets show promising improvements in terms of minority class recall, G-mean, and MAUC.
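To make two of the reported measures concrete, the following is a minimal sketch of per-class recall and G-mean. It assumes the common multi-class definition of G-mean as the geometric mean of the per-class recalls. The function names and the toy labels are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter
from math import prod


def per_class_recall(y_true, y_pred):
    """Recall per class: correctly predicted instances / true instances."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


def g_mean(y_true, y_pred):
    """Geometric mean of the per-class recalls (assumed definition)."""
    recalls = per_class_recall(y_true, y_pred)
    return prod(recalls.values()) ** (1.0 / len(recalls))


# Hypothetical imbalanced toy data: 6 x "a", 3 x "b", 1 x "c".
y_true = ["a"] * 6 + ["b"] * 3 + ["c"]
# A classifier biased towards the majority class "a".
y_pred = ["a"] * 6 + ["b", "b", "a"] + ["a"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recalls = per_class_recall(y_true, y_pred)
score = g_mean(y_true, y_pred)
print(accuracy, recalls, score)
```

In this toy case the overall accuracy is 0.8, yet the single instance of the minority class "c" is missed, so its recall and therefore the G-mean are both zero. This is why such measures are preferred to plain accuracy on imbalanced data.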

Author information

Corresponding author

Correspondence to Sattar Hashemi.

Copyright information

© 2014 Springer India

About this paper

Cite this paper

Abdi, L., Hashemi, S. (2014). An Ensemble Pruning Approach Based on Reinforcement Learning in Presence of Multi-class Imbalanced Data. In: Pant, M., Deep, K., Nagar, A., Bansal, J. (eds) Proceedings of the Third International Conference on Soft Computing for Problem Solving. Advances in Intelligent Systems and Computing, vol 258. Springer, New Delhi. https://doi.org/10.1007/978-81-322-1771-8_52

  • DOI: https://doi.org/10.1007/978-81-322-1771-8_52

  • Publisher Name: Springer, New Delhi

  • Print ISBN: 978-81-322-1770-1

  • Online ISBN: 978-81-322-1771-8

  • eBook Packages: Engineering, Engineering (R0)
