DE-Forest – Optimized Decision Tree Ensemble

  • Conference paper
  • In: Computational Collective Intelligence (ICCCI 2023)

Abstract

Classifier ensembles remain in the spotlight due to their proven effectiveness in many practical problems. One of the most interesting approaches constructs such models from classifiers trained on randomly selected attribute subsets; a key algorithm in this family is Random Forest, which uses decision trees as base classifiers. Although these methods achieve outstanding quality on many tasks, random attribute selection for the individual base classifiers does not guarantee an optimal ensemble. In this paper, we propose DE-Forest, an ensemble classifier learning method that uses evolutionary techniques to select non-random, optimized attribute subsets. Differential Evolution was chosen as the optimization algorithm, selecting the best set of attributes for training each base classifier. The quality of the proposed method was evaluated in computer experiments on many benchmark datasets. The obtained results are promising and show the superiority of the proposed method over the benchmark solutions.
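
To make the idea concrete, the following is a minimal Python sketch of the optimization step described in the abstract, not the authors' exact implementation: it uses SciPy's differential_evolution as a stand-in optimizer, and the example dataset, the 0.5 threshold, and the DE settings are illustrative assumptions. Each candidate solution encodes continuous per-attribute weights that are thresholded into a feature subset, and its fitness is the cross-validated accuracy of a decision tree trained on that subset.

# Sketch: evolve one optimized attribute subset for a single base tree.
import numpy as np
from scipy.optimize import differential_evolution
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)  # stand-in benchmark dataset

def fitness(weights):
    # Threshold the continuous weights into a feature mask; DE minimizes,
    # so return the negated cross-validated accuracy.
    mask = weights > 0.5
    if not mask.any():
        return 1.0  # penalize an empty attribute subset
    tree = DecisionTreeClassifier(random_state=0)
    return -cross_val_score(tree, X[:, mask], y, cv=3).mean()

result = differential_evolution(
    fitness,
    bounds=[(0.0, 1.0)] * X.shape[1],  # one weight per attribute
    maxiter=15, popsize=10, seed=0,    # small budget for illustration
)
print("selected attributes:", np.flatnonzero(result.x > 0.5))

Repeating this optimization (for instance, with different random seeds) would yield one optimized attribute subset per base tree; combining the resulting trees, e.g., by majority voting, gives the ensemble.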

This work was supported by the Polish National Science Centre under the grant No. 2019/35/B/ST6/04442.


Notes

  1. https://github.com/w4k2/DE-Forest.


Author information

Correspondence to Joanna Grzyb.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Grzyb, J., Woźniak, M. (2023). DE-Forest – Optimized Decision Tree Ensemble. In: Nguyen, N.T., et al. (eds.) Computational Collective Intelligence. ICCCI 2023. Lecture Notes in Computer Science, vol. 14162. Springer, Cham. https://doi.org/10.1007/978-3-031-41456-5_61

  • DOI: https://doi.org/10.1007/978-3-031-41456-5_61

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-41455-8

  • Online ISBN: 978-3-031-41456-5

  • eBook Packages: Computer Science, Computer Science (R0)
