Abstract
The interest in AutoML search spaces has given rise to a plethora of studies aimed at better understanding the characteristics of these spaces. Exploratory landscape analysis is among the most commonly investigated techniques. However, in contrast with classical optimization problems, in AutoML defining the landscape may be as difficult as characterizing it. This is because the concept of solution neighborhood is not clear: the spaces have a large number of conditional hyperparameters and a somewhat hierarchical structure. This paper looks at the impact of different solution representations and distance metrics on the definition of these spaces, and at how they affect exploratory landscape analysis metrics. We conclude that these metrics are not able to deal with structured, complex spaces such as the AutoML ones, and that problem-related metrics might be the way to leverage the landscape complexity.
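As a minimal illustration of why the neighborhood of a solution depends on the chosen representation and distance, the sketch below compares two distances on a toy conditional space. Everything in it is an assumption made for illustration and is not taken from the paper: the three configurations, the hyperparameter names, the functions `flat_distance` and `hierarchical_distance`, and the radii.

```python
from typing import Callable, Dict, Set, Union

Config = Dict[str, Union[str, float]]

# Hypothetical toy space: "algorithm" is the root choice, and the
# remaining hyperparameters are conditional on it.
CONFIGS: Dict[str, Config] = {
    "x": {"algorithm": "svm", "C": 1.0},
    "y": {"algorithm": "svm", "C": 100.0},
    "z": {"algorithm": "logreg", "C": 1.0},
}


def flat_distance(a: Config, b: Config) -> float:
    """Hamming-style distance on a flat encoding: count differing keys."""
    keys = set(a) | set(b)
    return float(sum(1 for k in keys if a.get(k) != b.get(k)))


def hierarchical_distance(a: Config, b: Config, root: str = "algorithm") -> float:
    """Illustrative structure-aware distance (an assumption, not the paper's).

    A different root choice gives the maximum distance 1.0. With the same
    root, the distance is half the fraction of child values that differ.
    """
    if a[root] != b[root]:
        return 1.0
    children = sorted((set(a) | set(b)) - {root})
    if not children:
        return 0.0
    differing = sum(1 for k in children if a.get(k) != b.get(k))
    return 0.5 * differing / len(children)


def neighbors(name: str, dist: Callable[[Config, Config], float], radius: float) -> Set[str]:
    """Names of all other configurations within `radius` of `name`."""
    center = CONFIGS[name]
    return {
        other
        for other, cfg in CONFIGS.items()
        if other != name and dist(center, cfg) <= radius
    }


flat_nb = neighbors("x", flat_distance, radius=1.0)
hier_nb = neighbors("x", hierarchical_distance, radius=0.5)
print("flat neighbors of x:", sorted(flat_nb))
print("hierarchical neighbors of x:", sorted(hier_nb))
```

Under the flat encoding, changing the algorithm costs the same as changing one value, so both other configurations are neighbors of `x`. Under the structure-aware distance, only the configuration sharing the same algorithm is. Any landscape metric computed over neighborhoods would therefore see two different landscapes for the same set of solutions.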
Notes
- 1.
- 2. https://archive.ics.uci.edu/ml/index.php.
- 3. https://www.kaggle.com/datasets.
Acknowledgements
This work was supported by FAPEMIG (through grant no. CEX-PPM-00098-17), MPMG (through the project Analytical Capabilities), CNPq (through grant no. 310833/2019-1), and CAPES.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Teixeira, M.C., Pappa, G.L. (2023). On the Effect of Solution Representation and Neighborhood Definition in AutoML Fitness Landscapes. In: Pérez Cáceres, L., Stützle, T. (eds) Evolutionary Computation in Combinatorial Optimization. EvoCOP 2023. Lecture Notes in Computer Science, vol 13987. Springer, Cham. https://doi.org/10.1007/978-3-031-30035-6_15
DOI: https://doi.org/10.1007/978-3-031-30035-6_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-30034-9
Online ISBN: 978-3-031-30035-6
eBook Packages: Computer Science, Computer Science (R0)