An Ensemble Pruning Approach Based on Reinforcement Learning in Presence of Multi-class Imbalanced Data

Abdi, Lida; Hashemi, Sattar

doi:10.1007/978-81-322-1771-8_52


Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 258)


Abstract

In recent years, learning from imbalanced data sets has become a challenging issue in the machine learning and data mining communities. The problem arises when some classes have far fewer instances than others. Multi-class imbalanced data sets are common in real-world applications, and many standard machine learning algorithms have difficulty handling them. In this paper, we propose an ensemble pruning approach based on the Reinforcement Learning framework. Inspired by the Markov Decision Process, we treat ensemble pruning as a one-player game and select the best classifiers from the initial state space. The selected classifiers, which together form a good ensemble model, are then used to learn from multi-class imbalanced data sets. Our experimental results on several UCI and KEEL benchmark data sets show promising improvements in terms of minority class recall, G-mean, and MAUC.
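To make the evaluation measures concrete, the following is a minimal Python sketch, not the authors' implementation. It computes per-class recall and G-mean (taken here as the geometric mean of the per-class recalls, which is an assumption about the exact definition used in the paper) and then prunes a pool of classifiers. The pruning step uses plain greedy forward selection as a stand-in for the reinforcement learning search described in the abstract; the function names, the majority-vote combiner, and the toy predictions are all hypothetical.

```python
from collections import Counter
from math import prod


def per_class_recall(y_true, y_pred):
    """Recall per class: correctly predicted instances / true instances."""
    total = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / total[c] for c in total}


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls; 0 if any class is missed entirely."""
    recalls = list(per_class_recall(y_true, y_pred).values())
    return prod(recalls) ** (1.0 / len(recalls))


def majority_vote(predictions):
    """Combine equal-length prediction lists by plurality vote per instance."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def greedy_prune(pool, y_val):
    """Greedy forward selection (an assumed stand-in for the RL search).

    State: indices chosen so far. Action: add one unused classifier.
    Reward: G-mean of the voted ensemble on the validation labels.
    Stops when no single addition strictly improves the reward.
    """
    chosen, best = [], -1.0
    while len(chosen) < len(pool):
        scored = [
            (g_mean(y_val, majority_vote([pool[j] for j in chosen + [i]])), i)
            for i in range(len(pool))
            if i not in chosen
        ]
        score, idx = max(scored)
        if score <= best:
            break
        chosen.append(idx)
        best = score
    return chosen, best


# Toy validation labels: class 0 is the majority, class 2 the smallest minority.
y_val = [0, 0, 0, 0, 1, 1, 2]
# Hypothetical validation predictions of three base classifiers.
pool = [
    [0, 0, 0, 0, 0, 0, 0],  # always predicts the majority class
    [0, 0, 0, 0, 1, 1, 2],  # perfect on this toy set
    [0, 0, 0, 1, 1, 0, 2],  # mixed quality
]
selected, reward = greedy_prune(pool, y_val)
```

The always-majority classifier reaches an accuracy of 4/7 on this toy set but a G-mean of 0, which illustrates why accuracy alone is a poor reward when classes are imbalanced.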



Author information

Authors and Affiliations

Department of Computer Science and Engineering, Shiraz University, Shiraz, Iran
Lida Abdi & Sattar Hashemi


Corresponding author

Correspondence to Sattar Hashemi.

Editor information

Editors and Affiliations

Department of Paper Technology, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, India
Millie Pant
Department of Mathematics, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, India
Kusum Deep
Department of Mathematics and Computer Science, Liverpool Hope University, Liverpool, United Kingdom
Atulya Nagar
Department of Applied Mathematics, South Asian University, New Delhi, India
Jagdish Chand Bansal


About this paper

Cite this paper

Abdi, L., Hashemi, S. (2014). An Ensemble Pruning Approach Based on Reinforcement Learning in Presence of Multi-class Imbalanced Data. In: Pant, M., Deep, K., Nagar, A., Bansal, J. (eds) Proceedings of the Third International Conference on Soft Computing for Problem Solving. Advances in Intelligent Systems and Computing, vol 258. Springer, New Delhi. https://doi.org/10.1007/978-81-322-1771-8_52


DOI: https://doi.org/10.1007/978-81-322-1771-8_52
Published: 04 March 2014
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-1770-1
Online ISBN: 978-81-322-1771-8
eBook Packages: Engineering, Engineering (R0)
