A Genetic-Based Ensemble Learning Applied to Imbalanced Data Classification

Klikowski, Jakub; Ksieniewicz, Paweł; Woźniak, Michał

doi:10.1007/978-3-030-33617-2_35

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11872)

Included in the following conference series:

International Conference on Intelligent Data Engineering and Automated Learning

Abstract

Imbalanced data classification remains a focus of intense research because of its ever-growing presence in real-life decision tasks. In this article, we focus on a classifier ensemble for imbalanced data classification. The ensemble is formed from individual classifiers trained on feature subsets selected in a supervised manner. Several methods train ensemble members on feature subsets to obtain a highly diverse ensemble; however, most of them, such as Random Subspace or Random Forest, select the attributes for a particular classifier randomly. The main drawback of these methods is that they offer no way to supervise and control the selection. In this work, we apply a genetic algorithm to the problem. The proposed method formulates an original learning criterion that takes into account not only the overall classification performance but also the diversity of the trained ensemble. The experimental study confirmed the high efficiency of the proposed algorithm and its superiority over other ensemble-forming methods based on random feature selection.
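The idea of evolving feature subsets under a criterion that rewards both performance and diversity can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the criterion, so the weighted sum with `alpha`, the use of balanced accuracy as the performance term, the Hamming-distance diversity term, the nearest-centroid base classifier, the genetic operators, and the toy data are all assumptions chosen to keep the example self-contained.

```python
import random

def fit_centroids(X, y, mask):
    """Fit per-class centroids on the features selected by a 0/1 mask."""
    cols = [j for j, bit in enumerate(mask) if bit]
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    return cols, centroids

def predict(model, x):
    """Return the label of the nearest centroid (squared Euclidean distance)."""
    cols, centroids = model
    def dist(c):
        return sum((x[j] - cj) ** 2 for j, cj in zip(cols, c))
    return min(sorted(centroids), key=lambda label: dist(centroids[label]))

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls, a common metric for imbalanced data."""
    recalls = []
    for label in set(y_true):
        idx = [i for i, t in enumerate(y_true) if t == label]
        recalls.append(sum(y_pred[i] == label for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

def diversity(masks):
    """Mean pairwise normalised Hamming distance between feature masks."""
    pairs = [(a, b) for i, a in enumerate(masks) for b in masks[i + 1:]]
    if not pairs:
        return 0.0
    return sum(sum(p != q for p, q in zip(a, b)) / len(a) for a, b in pairs) / len(pairs)

def fitness(masks, X, y, alpha=0.8):
    """Assumed criterion: alpha * performance + (1 - alpha) * diversity.
    Scored on the training data for brevity; a held-out split would be safer."""
    models = [fit_centroids(X, y, m) for m in masks]
    preds = []
    for x in X:
        votes = [predict(m, x) for m in models]
        preds.append(max(sorted(set(votes)), key=votes.count))  # majority vote
    return alpha * balanced_accuracy(y, preds) + (1 - alpha) * diversity(masks)

def random_mask(n_features, rng):
    """Random 0/1 mask that selects at least one feature."""
    mask = [rng.randint(0, 1) for _ in range(n_features)]
    if not any(mask):
        mask[rng.randrange(n_features)] = 1
    return mask

def evolve(X, y, n_members=3, pop_size=12, generations=15, seed=0):
    """Evolve ensembles; each individual is a list of per-member masks."""
    rng = random.Random(seed)
    n_features = len(X[0])
    pop = [[random_mask(n_features, rng) for _ in range(n_members)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: fitness(ind, X, y), reverse=True)
        elite = pop[: pop_size // 2]  # truncation selection with elitism
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover: each member mask comes from one parent.
            child = [list(rng.choice(pair)) for pair in zip(a, b)]
            # Mutation: flip one bit of one member, undone if it empties the mask.
            m, j = rng.randrange(n_members), rng.randrange(n_features)
            child[m][j] ^= 1
            if not any(child[m]):
                child[m][j] = 1
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda ind: fitness(ind, X, y))

def toy_data(seed=1):
    """Imbalanced toy set: 10 minority vs 50 majority samples, 6 features,
    of which only the first two carry class information."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(60):
        label = 1 if i < 10 else 0
        informative = [rng.gauss(2.0 * label, 1.0) for _ in range(2)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(4)]
        X.append(informative + noise)
        y.append(label)
    return X, y

if __name__ == "__main__":
    X, y = toy_data()
    best = evolve(X, y)
    print(best, fitness(best, X, y))
```

Raising `alpha` towards 1 makes the search favour ensemble performance alone, while lowering it pushes the member masks apart; a purely random subspace method corresponds to drawing the masks once and skipping the evolution loop.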

Notes

1. https://github.com/w4k2/genetic-ensemble-selection

Acknowledgments

This work was supported by the Polish National Science Centre under grant No. 2017/27/B/ST6/01325, as well as by the statutory funds of the Department of Systems and Computer Networks, Faculty of Electronics, Wrocław University of Science and Technology.

Author information

Authors and Affiliations

Department of Systems and Computer Networks, Wrocław University of Science and Technology, Wrocław, Poland
Jakub Klikowski, Paweł Ksieniewicz & Michał Woźniak

Corresponding author

Correspondence to Paweł Ksieniewicz.

Editor information

Editors and Affiliations

University of Manchester, Manchester, UK
Hujun Yin
Technical University of Madrid, Madrid, Spain
David Camacho
University of Birmingham, Birmingham, UK
Peter Tino
University of Huelva, Huelva, Spain
Antonio J. Tallón-Ballesteros
University of Exeter, Exeter, UK
Ronaldo Menezes
University of Manchester, Manchester, UK
Richard Allmendinger

About this paper

Cite this paper

Klikowski, J., Ksieniewicz, P., Woźniak, M. (2019). A Genetic-Based Ensemble Learning Applied to Imbalanced Data Classification. In: Yin, H., Camacho, D., Tino, P., Tallón-Ballesteros, A., Menezes, R., Allmendinger, R. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2019. Lecture Notes in Computer Science, vol 11872. Springer, Cham. https://doi.org/10.1007/978-3-030-33617-2_35

DOI: https://doi.org/10.1007/978-3-030-33617-2_35
Published: 18 October 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33616-5
Online ISBN: 978-3-030-33617-2
eBook Packages: Computer Science, Computer Science (R0)