A review of automatic selection methods for machine learning algorithms and hyper-parameter values

Luo, Gang

doi:10.1007/s13721-016-0125-6

A review of automatic selection methods for machine learning algorithms and hyper-parameter values

Original Article
Published: 23 May 2016

Volume 5, article number 18, (2016)
Cite this article

Network Modeling Analysis in Health Informatics and Bioinformatics Aims and scope Submit manuscript

Gang Luo¹

6339 Accesses
185 Citations
6 Altmetric
Explore all metrics

Abstract

Machine learning studies automatic algorithms that improve themselves through experience. It is widely used for analyzing and extracting value from large biomedical data sets, or “big biomedical data,” advancing biomedical research, and improving healthcare. Before a machine learning model is trained, the user of a machine learning software tool typically must manually select a machine learning algorithm and set one or more model parameters termed hyper-parameters. The algorithm and hyper-parameter values used can greatly impact the resulting model’s performance, but their selection requires special expertise as well as many labor-intensive manual iterations. To make machine learning accessible to layman users with limited computing expertise, computer science researchers have proposed various automatic selection methods for algorithms and/or hyper-parameter values for a given supervised machine learning problem. This paper reviews these methods, identifies several of their limitations in the big biomedical data environment, and provides preliminary thoughts on how to address these limitations. These findings establish a foundation for future research on automatically selecting algorithms and hyper-parameter values for analyzing big biomedical data.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

What Is Machine Learning?

Artificial intelligence in disease diagnosis: a systematic literature review, synthesizing framework and future research agenda

Article 13 January 2022

Imbalanced data preprocessing techniques for machine learning: a systematic mapping study

Article 09 November 2022

References

Adankon MM, Cheriet M (2009) Model selection for the LS-SVM. Application to handwriting recognition. Pattern Recognit 42(12):3264–3270
Article MATH Google Scholar
Ali A, Caruana R, Kapoor A (2014) Active learning with model selection. In: Proceedings of AAAI’14, pp 1673–1679
Alpaydin E (2014) Introduction to machine learning, 3rd edn. The MIT Press, Cambridge
MATH Google Scholar
Bardenet R, Brendel M, Kégl B, Sebag M (2013) Collaborative hyperparameter tuning. In: Proceedings of ICML’13, pp 199–207
Bengio Y (2000) Gradient-based optimization of hyperparameters. Neural Comput 12(8):1889–1900
Article Google Scholar
Bergstra J, Bengio Y (2012) Random search for hyper-parameter optimization. J Mach Learn Res 13:281–305
MathSciNet MATH Google Scholar
Bergstra J, Bardenet R, Bengio Y, Kégl B (2011) Algorithms for hyper-parameter optimization. In: Proceedings of NIPS’11, pp 2546–2554
Bergstra J, Yamins D, Cox DD (2013) Hyperopt: a Python library for optimizing the hyperparameters of machine learning algorithms. In: Proceedings of SciPy 2013, pp 13–20
Bertsekas DP (1999) Nonlinear programming, 2nd edn. Athena Scientific, Belmont
MATH Google Scholar
Boyd S, Vandenberghe L (2004) Convex optimization. Cambridge University Press, Cambridge
Book MATH Google Scholar
Brazdil P, Soares C, da Costa JP (2003) Ranking learning algorithms: using IBL and meta-learning on accuracy and time results. Mach Learn 50(3):251–277
Article MATH Google Scholar
Burnham KP, Anderson DR (2003) Model selection and multimodel inference: a practical information-theoretic approach, 2nd edn. Springer, New York
MATH Google Scholar
Caruana R, Niculescu-Mizil A, Crew G, Ksikes A (2004) Ensemble selection from libraries of models. In: Proceedings of ICML’04
Chang F, Dean J, Ghemawat S, Hsieh WC, Wallach DA, Burrows M et al. (2006) Bigtable: a distributed storage system for structured data. In: Proceedings of OSDI’06, pp 205–218
Claeskens G, Hjort N (2008) Model selection and model averaging. Cambridge University Press, Cambridge
Book MATH Google Scholar
Cleophas TJ, Zwinderman AH (2013a) Machine learning in medicine. Springer, New York
Book Google Scholar
Cleophas TJ, Zwinderman AH (2013b) Machine learning in medicine: Part 2. Springer, New York
Book Google Scholar
Cleophas TJ, Zwinderman AH (2013c) Machine learning in medicine: Part 3. Springer, New York
Book Google Scholar
Dean J, Ghemawat S (2004) MapReduce: simplified data processing on large clusters. In: Proceedings of OSDI’04, pp 137–150
Domhan T, Springenberg JT, Hutter F (2015) Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves. In: Proceedings of IJCAI’15, pp 3460–3468
Einbinder JS, Scully KW, Pates RD, Schubart JR, Reynolds RE (2001) Case study: a data warehouse for an academic medical center. J Healthc Inf Manag. 15(2):165–175
Google Scholar
Feurer M, Klein A, Eggensperger K, Springenberg J, Blum M, Hutter F (2015a) Efficient and robust automated machine learning. In: Proceedings of NIPS’15, pp 2944–2952
Feurer M, Springenberg T, Hutter F (2015b) Initializing Bayesian hyperparameter optimization via meta-learning. In: Proceedings of AAAI’15, pp 1128–1135
Fürnkranz J, Petrak J (2001) An evaluation of landmarking variants. In: Proceedings ECML/PKDD Workshop on Integrating Aspects of Data Mining, Decision Support and Meta-Learning 2001, pp 57–68
Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB (2013) Bayesian data analysis, 3rd edn. Chapman and Hall/CRC, Boca Raton
MATH Google Scholar
Google Prediction API homepage (2016) https://cloud.google.com/prediction/docs. Accessed 20 January 2016
Gu B, Liu B, Hu F, Liu H (2001) Efficiently determining the starting sample size for progressive sampling. In: Proceedings of ECML’01, pp 192–202
Guo XC, Yang JH, Wu CG, Wang CY, Liang YC (2008) A novel LS-SVMs hyper-parameter selection based on particle swarm optimization. Neurocomputing 71(16–18):3211–3215
Article Google Scholar
Guyon I, Bennett K, Cawley GC, Escalante HJ, Escalera S, Ho TK, Macià N, Ray B, Saeed M, Statnikov AR, Viegas E (2015) Design of the 2015 ChaLearn AutoML challenge. In: Proceedings of IJCNN’15, pp 1–8
Hendry DF, Doornik JA (2014) Empirical model discovery and theory evaluation: automatic selection methods in econometrics. The MIT Press, Cambridge
Book Google Scholar
Hoffman MD, Shahriari B, de Freitas N (2014) On correlation and budget constraints in model-based bandit optimization with application to automatic machine learning. In: Proceedings of AISTATS’14, pp 365–374
Hutter F, Hoos HH, Leyton-Brown K, Stützle T (2009) ParamILS: an automatic algorithm configuration framework. J Artif Intell Res 36:267–306
MATH Google Scholar
Hutter F, Hoos HH, Leyton-Brown K (2011) Sequential model-based optimization for general algorithm configuration. In: Proceedings of LION’11, pp 507–523
Hutter F, Hoos H, Leyton-Brown K (2014) An efficient approach for assessing hyperparameter importance. In: Proceedings of ICML’14, pp 754–762
John GH, Langley P (1996) Static versus dynamic sampling for data mining. In: Proceedings of KDD’96, pp 367–370
Jovic A, Brkic K, Bogunovic N (2014) An overview of free software tools for general data mining. In: Proceedings of MIPRO’14, pp 1112–1117
Kadane JB, Lazar NA (2004) Methods and criteria for model selection. J Am Stat Assoc 99(465):279–290
Article MathSciNet MATH Google Scholar
Komer B, Bergstra J, Eliasmith C (2014) Hyperopt-sklearn: automatic hyperparameter configuration for scikit-learn. In: Proceedings of SciPy 2014, pp 33–39
Kraska T, Talwalkar A, Duchi JC, Griffith R, Franklin MJ, Jordan MI (2013) MLbase: a distributed machine-learning system. In: Proceedings of CIDR’13
Lacoste A, Larochelle H, Marchand M, Laviolette F (2014a) Sequential model-based ensemble optimization. In: Proceedings of UAI’14, pp 440–448
Lacoste A, Marchand M, Laviolette F, Larochelle H (2014b) Agnostic Bayesian learning of ensembles. In: Proceedings of ICML’14, pp 611–619
Leite R, Brazdil P (2005) Predicting relative performance of classifiers from samples. In: Proceedings of ICML’05, pp 497–503
Leite R, Brazdil P (2010) Active testing strategy to predict the best classification algorithm via sampling and metalearning. In: Proceedings of ECAI’10, pp 309–314
Leite R, Brazdil P, Vanschoren J (2012) Selecting classification algorithms with active testing. In: Proceedings of MLDM’12, pp 117–131
Liu H, Motoda H (2013) Feature selection for knowledge discovery and data mining. Springer, New York
MATH Google Scholar
Luo G (2015) MLBCD: a machine learning tool for big clinical data. Health Inf Sci Syst 3:3
Article Google Scholar
Luo G (2016) Automatically explaining machine learning prediction results: a demonstration on type 2 diabetes risk prediction. Health Inf Sci Syst 4:2
Article Google Scholar
Luo G, Frey LJ (2016) Efficient execution methods of pivoting for bulk extraction of Entity–Attribute–Value-modeled data. IEEE J Biomed Health Inform. 20(2):644–654
Article Google Scholar
Luo G, Nkoy FL, Gesteland PH, Glasgow TS, Stone BL (2014) A systematic review of predictive modeling for bronchiolitis. Int J Med Inform 83(10):691–714
Article Google Scholar
Luo G, Nkoy FL, Stone BL, Schmick D, Johnson MD (2015a) A systematic review of predictive models for asthma development in children. BMC Med Inform Decis Mak 15(1):99
Article Google Scholar
Luo G, Stone BL, Sakaguchi F, Sheng X, Murtaugh MA (2015b) Using computational approaches to improve risk-stratified patient management: rationale and methods. JMIR Res Protoc. 4(4):e128
Article Google Scholar
Luo G, Stone BL, Johnson MD, Nkoy FL (2016) Predicting appropriate admission of bronchiolitis patients in the emergency room: rationale and methods. JMIR Res Protoc. 5(1):e41
Article Google Scholar
Maron O, Moore AW (1993) Hoeffding races: accelerating model selection search for classification and function approximation. In: Proceedings of NIPS’93, pp 59–66
Nadkarni PM (2011) Metadata-driven software systems in biomedicine: designing systems that can adapt to changing knowledge. Springer, New York
Book Google Scholar
Nocedal J, Wright S (2006) Numerical optimization, 2nd edn. Springer, New York
MATH Google Scholar
Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O et al (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830
MathSciNet MATH Google Scholar
Petrak J (2000) Fast subsampling performance estimates for classification algorithm selection. In: Proceedings of the ECML Workshop on Meta-Learning: Building Automatic Advice Strategies for Model Selection and Method Combination 2000, pp 3–14
Pfahringer B, Bensusan H, Giraud-Carrier CG (2000) Meta-learning by landmarking various learning algorithms. In: Proceedings of ICML’00, pp 743–750
Provost FJ, Jensen D, Oates T (1999) Efficient progressive sampling. In: Proceedings of KDD’99, pp 23–32
Roski J, Bo-Linn GW, Andrews TA (2014) Creating value in health care through big data: opportunities and policy implications. Health Aff (Millwood) 33(7):1115–1122
Article Google Scholar
Sabharwal A, Samulowitz H, Tesauro G (2016) Selecting near-optimal learners via incremental data allocation. In: Proceedings of AAAI’16
Shahriari B, Swersky K, Wang Z, Adams RP, de Freitas N (2015) Taking the human out of the loop: a review of Bayesian optimization. Proc IEEE 104(1):148–175
Article Google Scholar
Snoek J, Larochelle H, Adams RP (2012) Practical Bayesian optimization of machine learning algorithms. In: Proceedings of NIPS’12, pp 2960–2968
Soares C, Petrak J, Brazdil P (2001) Sampling-based relative landmarks: systematically test-driving algorithms before choosing. In: Proceedings of EPIA’01, pp 88–95
Sparks ER, Talwalkar A, Smith V, Kottalam J, Pan X, Gonzalez JE et al. (2013) MLI: an API for distributed machine learning. In: Proceedings of ICDM’13, pp 1187–1192
Sparks ER, Talwalkar A, Haas D, Franklin MJ, Jordan MI, Kraska T (2015) Automating model search for large scale machine learning. In: Proceedings of SoCC’15, pp 368–380
Steyerberg EW (2009) Clinical prediction models: a practical approach to development, validation, and updating. Springer, New York
Book MATH Google Scholar
Swersky K, Snoek J, Adams RP (2013) Multi-task Bayesian optimization. In: Proceedings of NIPS’13, 2004–2012
Swersky K, Snoek J, Adams RP (2014) Freeze-thaw Bayesian optimization. http://arxiv.org/abs/1406.3896. Accessed 20 January 2016
Thornton C, Hutter F, Hoos HH, Leyton-Brown K (2013) Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms. In: Proceedings of KDD’13, pp 847–855
van Rijn JN, Abdulrahman SM, Brazdil P, Vanschoren J (2015) Fast algorithm selection using learning curves. In: Proceedings of IDA’15, pp 298–309
Wang L, Feng M, Zhou B, Xiang B, Mahadevan S (2015) Efficient hyper-parameter optimization for NLP applications. In: Proceedings of EMNLP’15, 2112–2117
White JM (2013) Bandit algorithms for website optimization. O’Reilly Media, Sebastopol
Google Scholar
Wistuba M, Schilling N, Schmidt-Thieme L (2015a) Hyperparameter search space pruning—a new component for sequential model-based hyperparameter optimization. In: Proceedings of ECML/PKDD (2) 2015, pp 104–119
Wistuba M, Schilling N, Schmidt-Thieme L (2015b) Learning hyperparameter optimization initializations. In: Proceedings of DSAA’15, pp 1–10
Witten IH, Frank E, Hall MA (2011) Data mining: practical machine learning tools and techniques, 3rd edn. Morgan Kaufmann, Burlington
Google Scholar
Yogatama D, Mann G (2014) Efficient transfer learning method for automatic hyperparameter tuning. In: Proceedings of AISTATS’14, pp 1077–1085
Zaharia M, Chowdhury M, Franklin MJ, Shenker S, Stoica I (2010) Spark: cluster computing with working sets. In: Proceedings of HotCloud 2010
Zhou Z (2012) Ensemble methods: foundations and algorithms. Chapman and Hall/CRC, Boca Raton
Google Scholar

Download references

Acknowledgments

We thank Qing T. Zeng, Michael Conway, Philip J. Brewster, David E. Jones, Angela P. Presson, Yue Zhang, Tom Greene, Alun Thomas, and Selena B. Thomas for helpful discussions.

Author information

Authors and Affiliations

Department of Biomedical Informatics, University of Utah, Suite 140, 421 Wakara Way, Salt Lake City, UT, 84108, USA
Gang Luo

Authors

Gang Luo
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Gang Luo.

Ethics declarations

Conflict of interest

The author reports no conflicts of interest.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Luo, G. A review of automatic selection methods for machine learning algorithms and hyper-parameter values. Netw Model Anal Health Inform Bioinforma 5, 18 (2016). https://doi.org/10.1007/s13721-016-0125-6

Download citation

Received: 20 January 2016
Revised: 04 April 2016
Accepted: 06 May 2016
Published: 23 May 2016
DOI: https://doi.org/10.1007/s13721-016-0125-6

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

A review of automatic selection methods for machine learning algorithms and hyper-parameter values

Abstract

Access this article

Similar content being viewed by others

What Is Machine Learning?

Artificial intelligence in disease diagnosis: a systematic literature review, synthesizing framework and future research agenda

Imbalanced data preprocessing techniques for machine learning: a systematic mapping study

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Rights and permissions

About this article

Cite this article

Keywords

Navigation

A review of automatic selection methods for machine learning algorithms and hyper-parameter values

Abstract

Access this article

Similar content being viewed by others

What Is Machine Learning?

Artificial intelligence in disease diagnosis: a systematic literature review, synthesizing framework and future research agenda

Imbalanced data preprocessing techniques for machine learning: a systematic mapping study

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation