
Cross-disciplinary perspectives on meta-learning for algorithm selection

Published: 15 January 2009

Abstract

The algorithm selection problem [Rice 1976] seeks to answer the question: Which algorithm is likely to perform best for my problem? Recognizing the problem as a learning task in the early 1990s, the machine learning community developed the field of meta-learning, which focuses on learning about the performance of learning algorithms on classification problems. These ideas have seen only limited generalization beyond classification, however, and other disciplines (such as AI and operations research) have made many related attempts to tackle the algorithm selection problem in different ways, introducing different terminology and overlooking the similarities between approaches. There is therefore much to be gained from a greater awareness of developments in meta-learning, and of how these ideas can be generalized to learn about the behaviors of other (nonlearning) algorithms. In this article we present a unified framework for considering the algorithm selection problem as a learning problem, and use this framework to tie together the cross-disciplinary developments in tackling it. We discuss the generalization of meta-learning concepts to algorithms for tasks including sorting, forecasting, constraint satisfaction, and optimization, and the extension of these ideas to bioinformatics, cryptography, and other fields.
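The learning framework described above maps measurable features of a problem instance to a prediction of which algorithm will perform best. A minimal sketch of that idea follows; nothing in it comes from the article itself. The meta-features (instance count, attribute count, class entropy) and the 1-nearest-neighbor recommender are simplified assumptions standing in for the richer dataset characterizations surveyed here.

```python
import math

def meta_features(dataset):
    """Illustrative meta-features of a classification dataset:
    instance count, attribute count, and class entropy (in bits)."""
    X, y = dataset
    n = len(X)
    counts = {}
    for label in y:
        counts[label] = counts.get(label, 0) + 1
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return (n, len(X[0]), entropy)

def recommend(meta_db, new_dataset):
    """1-nearest-neighbor in meta-feature space: reuse the algorithm
    that performed best on the most similar previously seen problem."""
    f = meta_features(new_dataset)
    nearest = min(meta_db,
                  key=lambda rec: sum((a - b) ** 2
                                      for a, b in zip(f, rec["features"])))
    return nearest["best_algorithm"]

# Toy meta-knowledge base: meta-features of past problems and the winner.
meta_db = [
    {"features": (100, 4, 1.0), "best_algorithm": "decision tree"},
    {"features": (10000, 50, 3.2), "best_algorithm": "SVM"},
]
# A new problem: 120 instances, 4 attributes, two balanced classes.
new_dataset = ([[0, 1, 2, 3]] * 120, [0, 1] * 60)
print(recommend(meta_db, new_dataset))  # prints "decision tree"
```

In practice the meta-learner is a full classification or regression model trained over many benchmark problems, and the choice of informative meta-features is the central question the survey addresses.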

References

[1]
Achlioptas, D., Naor, A., and Peres, Y. 2005. Rigorous location of phase transitions in hard optimization problems. Nature 435 (Jun.), 759--764.
[2]
Aha, D. 1992. Generalizing from case studies: A case study. In Proceedings of the 9th International Conference on Machine Learning, 1--10.
[3]
Ali, S. and Smith, K. A. 2005. Kernel width selection for SVM classification: A meta learning approach. Int. J. Data Warehousing Mining 1, 78--97.
[4]
Ali, S. and Smith, K. 2006. On learning algorithm selection for classification. Appl. Soft Comput. 6, 2, 119--138.
[5]
Ali, S. and Smith-Miles, K. 2006. A meta-learning approach to automatic kernel selection for support vector machines. Neurocomput. 70, 1-3, 173--186.
[6]
Angel, E. and Zissimopoulos, V. 2002. On the hardness of the quadratic assignment problem with meta-heuristics. J. Heurist. 8, 399--414.
[7]
Arinze, B. 1995. Selecting appropriate forecasting models using rule induction. Omega Int. J. Manage. Sci. 22, 6, 647--658.
[8]
Arinze, B., Kim, S. L., and Anandarajan, M. 1997. Combining and selecting forecasting models using rule based induction. Comput. Oper. Res. 24, 5, 423--433.
[9]
Armstrong, W., Christen, P., Mccreath, E., and Rendell, A. P. 2006. Dynamic algorithm selection using reinforcement learning. In Proceedings of the International Workshop on Integrating AI and Data Mining, 18--24.
[10]
Asuncion, A. and Newman, D. J. 2007. UCI Machine Learning Repository. http://www.ics.uci.edu/~mlearn/MLRepository.html. University of California, Irvine, Department of Information and Computer Science.
[11]
Bachelet, V. 1999. Métaheuristiques parallèles hybrides: Application au problème d'affectation quadratique. Ph.D. dissertation, Université des Sciences et Technologies de Lille, France. December.
[12]
Bensusan, H., Giraud-Carrier, C., and Kennedy, C. 2000. A higher-order approach to meta-learning. In Proceedings of the European Conference on Machine Learning, Workshop on Meta-Learning: Building Automatic Advice Strategies for Model Selection and Method Combination, 109--117.
[13]
Bensusan, H. and Kalousis, A. 2001. Estimating the predictive accuracy of a classifier. Lecture Notes in Computer Science, vol. 2167, 25--31.
[14]
Bernstein, A., Provost, F., and Hill, S. 2005. Toward intelligent assistance for a data mining process: An ontology-based approach for cost-sensitive classification. IEEE Trans. Knowl. Data Eng. 17, 4, 503--518.
[15]
Berrer, H., Paterson, I., and Keller, J. 2000. Evaluation of machine-learning algorithm ranking advisors. In Proceedings of the PKDD-00 Workshop on Data Mining, Decision Support, Meta-Learning and ILP: Forum for Practical Problem Presentation and Prospective Solutions, P. Brazdil and A. Jorge, eds.
[16]
Borenstein, Y. and Poli, R. 2006. Kolmogorov complexity, optimization and hardness. In Proceedings of the IEEE Congress on Evolutionary Computation, 112--119.
[17]
Boukeas, G., Halatsis, C., Zissimopoulos, V., and Stamatopoulos, P. 2004. Measures of intrinsic hardness for constraint satisfaction problem instances. Lecture Notes in Computer Science, vol. 2932, 184--195.
[18]
Brazdil, P. and Henery, R. 1994. Analysis of results. In Machine Learning, Neural and Statistical Classification, D. Michie et al., eds. Ellis Horwood, New York. Chapter 10.
[19]
Brazdil, P., Soares, C., and Costa, J. 2003. Ranking learning algorithms: Using IBL and meta-learning on accuracy and time results. Mach. Learn. 50, 3, 251--277.
[20]
Brodley, C. E. 1993. Addressing the selective superiority problem: Automatic algorithm/model class selection. In Proceedings of the 10th International Machine Learning Conference, 17--24.
[21]
Burke, E., Kendall, G., Newall, J., Hart, E., Ross, P., and Schulenburg, S. 2003. Hyper-Heuristics: An emerging direction in modern search technology. In Handbook of Metaheuristics, Glover and Kochenberger, eds. Kluwer Academic, Dordrecht, 457--474.
[22]
Castiello, C., Castellano, G., and Fanelli, A. M. 2005. Meta-Data: Characterization of input features for meta-learning. Lecture Notes in Artificial Intelligence, vol. 3558, 457--468.
[23]
Chan, P. and Stolfo, S. J. 1997. On the accuracy of meta-learning for scalable data mining. J. Intell. Inf. Syst. 8, 19, 5--28.
[24]
Chu, C. H. and Widjaja, D. 1994. A neural network system for forecasting method selection. Decis. Support Syst. 12, 13--24.
[25]
Desjardins, M. and Gordon, D. F. 1995. Special issue on bias evaluation and selection. Mach. Learn. 20, 1--2.
[26]
Duch, W. and Grudzinski, K. 2001. Meta-Learning: Searching in the model space. In Proceedings of the International Conference on Neural Information Processing (ICONIP) I, 235--240.
[27]
Estivill-Castro, V. and Wood, D. 1992. A survey of adaptive sorting algorithms. ACM Comput. Surv. 24, 4, 441--476.
[28]
Frey, P. W. and Slate, D. J. 1991. Letter recognition using Holland-style adaptive classifiers. Mach. Learn. 6, 161--182.
[29]
Furnkranz, J. and Petrak, J. 2001. An evaluation of landmarking variants. In Proceedings of the ECML/PKDD Workshop on Integrating Aspects of Data Mining, Decision Support and Meta-Learning (IDDM), C. Giraud-Carrier et al., eds., 57--68.
[30]
Gagliolo, M. and Schmidhuber, J. 2006. Learning dynamic algorithm portfolios. Ann. Math. Artif. Intell. 47, 3-4 (Aug.), 295--328.
[31]
Gama, J. and Brazdil, P. 2000. Cascade generalization. Mach. Learn. 41, 3, 315--343.
[32]
Gama, J. and Brazdil, P. 1995. Characterization of classification algorithms. In Proceedings of the 7th Portuguese Conference in AI, 83--102.
[33]
Glover, F. and Kochenberger, G. 2003. Handbook of Metaheuristics. Kluwer Academic, Dordrecht.
[34]
Gnanadesikan, R. 1997. Methods for Statistical Data Analysis of Multivariate Observations, 2nd ed. Wiley, New York.
[35]
Guo, H. 2003. Algorithm selection for sorting and probabilistic inference: A machine learning-based approach. Ph.D. dissertation, Kansas State University.
[36]
Hilario, M. 2002. Model complexity and algorithm selection in classification. In Proceedings of the 5th International Conference on Discovery Science (DS-02), S. Lange et al., eds., 113--126.
[37]
Hoos, H. H. and Stützle, T. 1999. Towards a characterisation of the behaviour of stochastic local search algorithms for SAT. Artif. Intell. 112, 1-2, 213--232.
[38]
Horvitz, E., Ruan, Y., Gomes, C., Kautz, H., Selman, B., and Chickering, M. 2001. A Bayesian approach to tackling hard computational problems. In Proc. 17th Conference on Uncertainty in Artificial Intelligence.
[39]
Hyndman, R. 2002. Time Series Data Library, available from http://www-personal.buseco.monash.edu.au/~hyndman/TSDL/.
[40]
Jones, N. and Pevzner, P. 2004. An Introduction to Bioinformatics Algorithms. MIT Press, Cambridge, MA.
[41]
Jones, T. and Forrest, S. 1995. Fitness distance correlation as a measure of problem difficulty for genetic algorithms. In Proceedings of the International Conference on Genetic Algorithms, 184--192.
[42]
Kalousis, A. and Hilario, M. 2001. Model selection via meta-learning. Int. J. AI Tools 10, 4.
[43]
Kalousis, A. and Theoharis, T. 1999. Design, implementation and performance results of an intelligent assistant for classifier selection. Intell. Data Anal. 3, 5, 319--337.
[44]
Keogh, E. and Folias, T. 2002. The UCR Time Series Data Mining Archive. http://www.cs.ucr.edu/~eamonn/TSDMA/.
[45]
Knowles, J. D. and Corne, D. W. 2002. Towards landscape analysis to inform the design of a hybrid local search for the multiobjective quadratic assignment problem. In Soft Computing Systems: Design, Management and Applications, A. Abraham et al., eds. IOS Press, Amsterdam, 271--279.
[46]
Kohonen, T. 1988. Self-Organisation and Associative Memory. Springer, New York.
[47]
Kopf, C., Taylor, C., and Keller, J. 2000. Meta-Analysis: From data characterisation for meta-learning to meta-regression. In Proceedings of the PKDD Workshop on Data Mining, Decision Support, Meta-Learning and ILP, P. Brazdil and A. Jorge, eds.
[48]
Kuba, P., Brázdil, P., Soares, C., and Woznica, A. 2002. Exploiting sampling and meta-learning for parameter setting for support vector machines. In Proceedings of the Workshop Learning and Data Mining Associated with Iberamia VIII Iberoamerican Conference on Artificial Intelligence. F. Herrera et al., eds., 209--216.
[49]
Lagoudakis, M., Littman, M., and Parr, R. 2001. Selecting the right algorithm. In Proceedings of the AAAI Fall Symposium Series: Using Uncertainty within Computation.
[50]
Ler, D., Koprinska, I., and Chawla, S. 2005. Utilising regression-based landmarkers within a meta-learning framework for algorithm selection. In Proceedings of the Workshop on Meta-Learning, 22nd International Conference on Machine Learning (ICML), 44--51.
[51]
Leyton-Brown, K., Nudelman, E., and Shoham, Y. 2002. Learning the empirical hardness of optimization problems: The case of combinatorial auctions. Lecture Notes in Computer Science, vol. 2470, 556--569.
[52]
Leyton-Brown, K., Nudelman, E., Andrew, G., Mcfadden, J., and Shoham, Y. 2003a. A portfolio approach to algorithm selection. In Proceedings of the International Joint Conference on Artificial Intelligence, 1542--1543.
[53]
Leyton-Brown, K., Nudelman, E., Andrew, G., Mcfadden, J., and Shoham, Y. 2003b. Boosting as a metaphor for algorithm design. In Proceedings of the Conference on Principles and Practice of Constraint Programming, 899--903.
[54]
Lim, T.-S., Loh, W.-Y., and Shih, Y.-S. 2000. A comparison of prediction accuracy, complexity, and training time of thirty-three old and new classification algorithms. Mach. Learn. 40, 203--228.
[55]
Linder, C. and Studer, R. 1999. AST: Support for algorithm selection with a CBR approach. In Proceedings of the 16th International Conference on Machine Learning.
[56]
Makridakis, S. and Hibon, M. 2000. The M3-competition: Results, conclusions and implications. Int. J. Forecast. 16, 4, 451--476.
[57]
Marom, Y., Zukerman, I., and Japkowicz, N. 2007. A meta-learning approach for selecting between response automation strategies in a help-desk domain. In Proceedings of the 22nd AAAI Conference on Artificial Intelligence, 907--912.
[58]
Maron, O. and Moore, A. W. 1997. The racing algorithm: Model selection for lazy learners. Artif. Intell. Rev., 193--225.
[59]
Merz, P. 2004. Advanced fitness landscape analysis and the performance of memetic algorithms. Evol. Comput. 12, 303--325.
[60]
Michie, D., Spiegelhalter, D. J., and Taylor C. C. eds. 1994. Machine Learning, Neural and Statistical Classification. Ellis Horwood, New York.
[61]
Nudelman, E., Leyton-Brown, K., Hoos, H., Devkar, A., and Shoham, Y. 2004. Understanding random SAT: Beyond the clauses-to-variables ratio. Lecture Notes in Computer Science, vol. 3258, 438--452.
[62]
Opitz, D. and Maclin, R. 1999. Popular ensemble methods: An empirical study. J. Artif. Intell. Res. 11, 169--198.
[63]
Peng, Y., Flach, P., Soares, C., and Brazdil P. 2002. Improved dataset characterization for meta-learning. In Proceedings of the 5th International Conference on Discovery Science.
[64]
Peterson, A. H. and Martinez, T. R. 2005. Estimating the potential for combining learning models. In Proceedings of the ICML Workshop on Meta-Learning, 68--75.
[65]
Pfahringer, B., Bensusan, H., and Giraud-Carrier, C. 2000. Meta-Learning by landmarking various learning algorithms. In Proceedings of the 17th International Conference on Machine Learning.
[66]
Prodromidis, A. L., Chan, P., and Stolfo, S. J. 2000. Meta-Learning in distributed data mining systems: Issues and approaches. In Advances of Distributed Data Mining, H. Kargupta and P. Chan, eds. AAAI Press.
[67]
Prudêncio, R. and Ludermir, T. 2004. Meta-Learning approaches to selecting time-series models. Neurocomput. 61, 121--137.
[68]
Rendell, L. and Cho, H. 1990. Empirical learning as a function of concept character. Mach. Learn. 5, 267--298.
[69]
Rice, H. G. 1953. Classes of recursively enumerable sets and their decision problems. Trans. Amer. Math. Soc. 74, 358--366.
[70]
Rice, J. R. 1976. The algorithm selection problem. Advances in Comput. 15, 65--118.
[71]
Samulowitz, H. and Memisevic, R. 2007. Learning to solve QBF. In Proceedings of the 22nd AAAI Conference on Artificial Intelligence, 255--260.
[72]
Seewald, A., Petrak, J., and Widmer, G. 2001. Hybrid decision tree learners with alternative leaf classifiers: An empirical study. In Proceedings of the 14th International FLAIRS Conference.
[73]
Selman, B., Mitchell, D. G., and Levesque, H. J. 1996. Generating hard satisfiability problems. Artif. Intell. 81, 17--29.
[74]
Shannon, C. E. 1948. A mathematical theory of communication. Bell Syst. Tech. J. 27 (Jul., Oct.), 379--423 and 623--656.
[75]
Slaney, J. and Walsh, T. 2001. Backbones in optimization and approximation. In Proceedings of the International Joint Conference on Artificial Intelligence.
[76]
Smith, K. A., Woo, F., Ciesielski, V., and Ibrahim, R. 2001. Modelling the relationship between problem characteristics and data mining algorithm performance using neural networks. In Smart Engineering System Design: Neural Networks, Fuzzy Logic, Evolutionary Programming, Data Mining, and Complex Systems, C. Dagli et al., eds. ASME Press, 11, 356--362.
[77]
Smith, K. A., Woo, F., Ciesielski, V., and Ibrahim, R. 2002. Matching data mining algorithm suitability to data characteristics using a self-organising map. In Hybrid Information Systems, A. Abraham and M. Koppen, eds. Physica, Heidelberg, 169--180.
[78]
Smith-Miles, K. A. 2008. Towards insightful algorithm selection for optimisation using meta-learning concepts. In Proceedings of the IEEE International Joint Conference on Neural Networks, 4118--4124.
[79]
Soares, C., Petrak, J., and Brazdil, P. 2001. Sampling-Based relative landmarks: Systematically test-driving algorithms before choosing. In Proceedings of the Portuguese AI Conference.
[80]
Soares, C., Brazdil, P., and Kuba, P. 2004. A meta-learning method to select the kernel width in support vector regression. Mach. Learn. 54, 3, 195--209.
[81]
Streeter, M., Golovin, D., and Smith, S. F. 2007. Combining multiple heuristics online. In Proceedings of the 22nd AAAI Conference on Artificial Intelligence, 1197--1203.
[82]
Stützle, T. and Fernandes, S. 2004. New benchmark instances for the QAP and the experimental analysis of algorithms. Lecture Notes in Computer Science, vol. 3004, 199--209.
[83]
Telelis, O. and Stamatopoulos, P. 2001. Combinatorial optimization through statistical instance-based learning. In Proceedings of the 13th International Conference on Tools with Artificial Intelligence, 203--209.
[84]
Todorovski, L., Blockeel, H., and Dzeroski, S. 2002. Ranking with predictive clustering trees. In Proceedings of the European Conference on Machine Learning.
[85]
Todorovski, L. and Dzeroski, S. 2003. Combining classifiers with meta decision trees. Mach. Learn.
[86]
Van Hemert, J. I. 2006. Evolving combinatorial problem instances that are difficult to solve. Evol. Comput. 14, 433--462.
[87]
Venkatachalam, A. R. and Sohl, J. E. 1999. An intelligent model selection and forecasting system. Int. J. Forecast. 18, 3, 67--180.
[88]
Vilalta, R. and Drissi, Y. 2002. A perspective view and survey of meta-learning. Artif. Intell. Rev. 18, 77--95.
[89]
Vollmann, T. E. and Buffa, E. S. 1966. The facility layout problem in perspective. Manage. Sci. 12, 10, 450--468.
[90]
Wallace, C. S. and Boulton D. M. 1968. An information measure for classification. Comput. J. 11, 2, 185--194.
[91]
Wallace, M. and Schimpf, J. 2002. Finding the right hybrid algorithm—A combinatorial meta-problem. Ann. Math. Artif. Intell. 34, 259--269.
[92]
Wang, X. 2005. Characteristic-Based forecasting for time series data, Ph.D. dissertation, Monash University, Australia.
[93]
Wang, X., Smith, K. A., and Hyndman, R. 2006. Characteristic-Based clustering for time series data. Data Mining Knowl. Discov. 13, 335--364.
[94]
Witten, I. H. and Frank, E. 2005. Data Mining: Practical Machine Learning Tools and Techniques, 2nd ed. Morgan Kaufmann, San Francisco.
[95]
Wolpert, D. and Macready, W. 1997. No free lunch theorems for optimization. IEEE Trans. Evol. Comput. 1, 67--82.
[96]
Wolpert, D. H. 1992. Stacked generalization. Neural Netw. 5, 241--259.
[97]
Xu, L., Hutter, F., Hoos, H., and Leyton-Brown, K. 2007a. SATzilla-07: The design and analysis of an algorithm portfolio for SAT. In Principles and Practices of Constraint Programming. Lecture Notes in Computer Science, 712--727.
[98]
Xu, L., Hoos, H., and Leyton-Brown, K. 2007b. Hierarchical hardness models for SAT. In Principles and Practices of Constraint Programming. Lecture Notes in Computer Science, 696--711.
[99]
Yang, J. and Jiu, B. 2006. Algorithm selection: A quantitative approach. Algorithmic Trading II: Precision, control, execution. http://www.itg.com/news_events/papers/AlgoSelection20/.


Published In

ACM Computing Surveys, Volume 41, Issue 1, January 2009, 281 pages.
ISSN: 0360-0300; EISSN: 1557-7341; DOI: 10.1145/1456650.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

Publication History

Received: 01 July 2007
Revised: 01 December 2007
Accepted: 01 March 2008
Published: 15 January 2009, in CSUR Volume 41, Issue 1

      Author Tags

      1. Algorithm selection
      2. classification
      3. combinatorial optimization
      4. constraint satisfaction
      5. dataset characterization
      6. empirical hardness
      7. forecasting
      8. landscape analysis
      9. meta-learning
      10. model selection
      11. sorting

