Abstract
Machine learning (ML) techniques play an important role in many real-world applications. One of the main challenges, however, is selecting the most accurate technique for a specific application. In the classification context, for instance, two main approaches can be applied: model selection and hyper-parameter selection. In the first approach, the best classification algorithm for a given input dataset is selected through a heuristic search over a large space of candidate classification algorithms and their corresponding hyper-parameter settings. As its main focus is the selection of the classification algorithm, this approach is referred to as model selection, and its methods are also known as automated machine learning (Auto-ML). The second approach fixes one classification system and performs an extensive search for the best hyper-parameters of that model. In this paper, we perform a broad and robust comparative analysis of both approaches for classifier ensembles. In this analysis, two methods of the first approach (Auto-WEKA and H\(_{2}\)O) are compared to four methods of the second approach (Genetic Algorithm, Particle Swarm Optimization, Tabu Search and GRASP). The main aim is to determine which of these techniques generates more accurate classifier ensembles under a given time constraint. Additionally, an empirical analysis is conducted on 21 classification datasets to evaluate the performance of these techniques. Our findings indicate that hyper-parameter selection methods produce the most accurate classifier ensembles, although this improvement was not confirmed as significant by the statistical test.
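The hyper-parameter selection approach described above can be sketched as a simple evolutionary search over ensemble settings. The sketch below is illustrative only, not the authors' implementation: the search space, the mutation scheme, and the `toy_accuracy` objective (a stand-in for the cross-validated accuracy a real system such as the paper's Genetic Algorithm would compute) are all hypothetical.

```python
import random

# Hypothetical search space for a classifier ensemble: which base learner
# to use, how many estimators, and how their votes are combined.
SPACE = {
    "n_estimators": [5, 10, 25, 50],
    "base_learner": ["tree", "knn", "naive_bayes"],
    "vote": ["majority", "weighted"],
}

def toy_accuracy(cfg):
    # Placeholder objective standing in for cross-validated accuracy:
    # rewards larger ensembles and weighted voting.
    score = 0.5 + 0.005 * cfg["n_estimators"]
    if cfg["vote"] == "weighted":
        score += 0.05
    return min(score, 0.99)

def random_config(rng):
    # Sample one hyper-parameter setting uniformly from the space.
    return {key: rng.choice(values) for key, values in SPACE.items()}

def mutate(cfg, rng):
    # Re-sample a single hyper-parameter of a parent configuration.
    child = dict(cfg)
    key = rng.choice(list(SPACE))
    child[key] = rng.choice(SPACE[key])
    return child

def genetic_search(generations=20, pop_size=8, seed=0):
    # Minimal genetic algorithm: keep the fittest half, refill by mutation.
    rng = random.Random(seed)
    population = [random_config(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=toy_accuracy, reverse=True)
        survivors = population[: pop_size // 2]
        population = survivors + [
            mutate(rng.choice(survivors), rng)
            for _ in range(pop_size - len(survivors))
        ]
    return max(population, key=toy_accuracy)

best = genetic_search()
```

A real version would replace `toy_accuracy` with the cross-validated accuracy of the ensemble trained under `cfg`, and the same skeleton applies, with different neighborhood moves, to Tabu Search or GRASP.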
Acknowledgements
This work has been financially supported by Capes/Brazil.
Cite this article
Feitosa-Neto, A.A., Xavier-Júnior, J.C., Canuto, A.M.P. et al. A study of model and hyper-parameter selection strategies for classifier ensembles: a robust analysis on different optimization algorithms and extended results. Nat Comput 20, 805–819 (2021). https://doi.org/10.1007/s11047-020-09816-0