Instance Space Analysis for Algorithm Testing: Methodology and Software Tools

Published: 02 March 2023

Abstract

Instance Space Analysis (ISA) is a recently developed methodology to (a) support objective testing of algorithms and (b) assess the diversity of test instances. Representing test instances as feature vectors, the ISA methodology extends Rice’s 1976 Algorithm Selection Problem framework to enable visualization of the entire space of possible test instances and to provide insight into how algorithm performance is affected by instance properties. Rather than reporting algorithm performance averaged across a chosen set of test problems, as is standard practice, ISA offers a more nuanced understanding of the unique strengths and weaknesses of algorithms across different regions of the instance space, strengths and weaknesses that averaging would otherwise hide. It also facilitates objective assessment of any bias in the chosen test instances and provides guidance about the adequacy of benchmark test suites. This article is a comprehensive tutorial on the ISA methodology, which has evolved over several years, and includes details of all algorithms and software tools that are enabling its worldwide adoption in many disciplines. A case study comparing algorithms for university timetabling illustrates the methodology and tools.
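To make the core idea concrete, the following is a minimal, illustrative Python sketch (not the authors' MATLAB toolkit) of what an instance space looks like in practice: each test instance is summarized by a feature vector, the collection of vectors is projected to two dimensions, and the resulting scatter plot is colored by whether an algorithm meets a chosen performance threshold. The feature values, the performance values, and the use of PCA are all illustrative assumptions; the ISA tools compute a tailored, performance-aware projection rather than PCA.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic stand-in data: 200 instances, each described by 6 features, plus the
# performance (e.g., solution quality) achieved by one algorithm on each instance.
features = rng.normal(size=(200, 6))
performance = features @ rng.normal(size=6) + rng.normal(scale=0.3, size=200)

# Standardize the features, then project to 2D to obtain instance-space coordinates.
# PCA is only a placeholder here; it is not the projection used by the ISA tools.
z = (features - features.mean(axis=0)) / features.std(axis=0)
coords = PCA(n_components=2).fit_transform(z)

# Mark instances where the algorithm meets a chosen "good performance" threshold.
good = performance > np.median(performance)

plt.scatter(coords[~good, 0], coords[~good, 1], c="lightgray", label="below threshold")
plt.scatter(coords[good, 0], coords[good, 1], c="tab:blue", label="good performance")
plt.xlabel("z1")
plt.ylabel("z2")
plt.legend()
plt.title("Toy instance space (synthetic data)")
plt.show()

In the ISA workflow itself, the 2D projection is chosen so that trends in features and performance are as easy to read as possible across the space, and regions of consistently good performance are summarized as algorithm footprints; the MATILDA platform and the accompanying MATLAB toolkit provide those computations.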





Published In

ACM Computing Surveys, Volume 55, Issue 12
December 2023
825 pages
ISSN: 0360-0300
EISSN: 1557-7341
DOI: 10.1145/3582891

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 02 March 2023
Online AM: 30 November 2022
Accepted: 22 November 2022
Revised: 13 November 2022
Received: 31 May 2021
Published in CSUR Volume 55, Issue 12


Author Tags

  1. Algorithm footprints
  2. algorithm selection
  3. benchmarking
  4. MATLAB
  5. meta-learning
  6. meta-heuristics
  7. software as a service
  8. test instance diversity
  9. timetabling

Qualifiers

  • Tutorial

Funding Sources

  • Australian Research Council under the Australian Laureate Fellowship scheme
  • ARC Training Centre in Optimisation Technologies, Integrated Methodologies and Applications (OPTIMA)


Article Metrics

  • Downloads (last 12 months): 653
  • Downloads (last 6 weeks): 68
Reflects downloads up to 18 Feb 2025

Cited By

  • (2025) On the Instance Dependence of Parameter Initialization for the Quantum Approximate Optimization Algorithm. INFORMS Journal on Computing 37, 1 (146-171). DOI: 10.1287/ijoc.2024.0564. Online publication date: 1-Jan-2025.
  • (2025) Hybridizing Carousel Greedy and Kernel Search: A new approach for the maximum flow problem with conflict constraints. European Journal of Operational Research. DOI: 10.1016/j.ejor.2025.02.006. Online publication date: Feb-2025.
  • (2025) Fifty years of metaheuristics. European Journal of Operational Research 321, 2 (345-362). DOI: 10.1016/j.ejor.2024.04.004. Online publication date: Mar-2025.
  • (2025) Understanding instance hardness for optimisation algorithms: Methodologies, open challenges and post-quantum implications. Applied Mathematical Modelling 142, 115965. DOI: 10.1016/j.apm.2025.115965. Online publication date: Jun-2025.
  • (2025) Supervised Dimensionality Reduction for the Algorithm Selection Problem. Advances in Computational Intelligence Systems (85-97). DOI: 10.1007/978-3-031-78857-4_7. Online publication date: 8-Jan-2025.
  • (2024) Adaptive stabilization based on machine learning for column generation. Proceedings of the 41st International Conference on Machine Learning (44741-44758). DOI: 10.5555/3692070.3693891. Online publication date: 21-Jul-2024.
  • (2024) MA-BBOB: A Problem Generator for Black-Box Optimization Using Affine Combinations and Shifts. ACM Transactions on Evolutionary Learning and Optimization. DOI: 10.1145/3673908. Online publication date: 21-Jun-2024.
  • (2024) Generating Cheap Representative Functions for Expensive Automotive Crashworthiness Optimization. ACM Transactions on Evolutionary Learning and Optimization 4, 2 (1-26). DOI: 10.1145/3646554. Online publication date: 8-Jun-2024.
  • (2024) Explaining instances in the health domain based on the exploration of a dataset's hardness embedding. Proceedings of the Genetic and Evolutionary Computation Conference Companion (1598-1606). DOI: 10.1145/3638530.3664113. Online publication date: 14-Jul-2024.
  • (2024) Extending Instance Space Analysis to Algorithm Configuration Spaces. Proceedings of the Genetic and Evolutionary Computation Conference Companion (147-150). DOI: 10.1145/3638530.3654264. Online publication date: 14-Jul-2024.
