On the Effect of Solution Representation and Neighborhood Definition in AutoML Fitness Landscapes

Teixeira, Matheus C.; Pappa, Gisele L.

doi:10.1007/978-3-031-30035-6_15

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13987)

Included in the following conference series:

European Conference on Evolutionary Computation in Combinatorial Optimization (Part of EvoStar)


Abstract

The interest in AutoML search spaces has given rise to a plethora of studies conceived to better understand the characteristics of these spaces. Exploratory landscape analysis is among the most commonly investigated techniques. However, in contrast with classical optimization problems, in AutoML defining the landscape may be as hard as characterizing it. This is because the concept of solution neighborhood is not clear: the spaces have a large number of conditional hyperparameters and a somewhat hierarchical structure. This paper looks at the impact of different solution representations and distance metrics on the definition of these spaces, and at how they affect exploratory landscape analysis metrics. We conclude that these metrics are not able to deal with structured, complex spaces such as the AutoML ones, and that problem-related metrics might be the way to leverage the landscape complexity.
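To make the role of representation and distance concrete, here is a minimal sketch. It is not the authors' method or data. The configurations, fitness values, and function names (`flat_distance`, `hierarchical_distance`, `fdc`) are all hypothetical, chosen only to illustrate how two distance definitions over the same conditional search space can give different values for a landscape metric, here fitness distance correlation (the Pearson correlation between fitness and distance to the best known solution).

```python
from math import sqrt

# Hypothetical toy search space: "algo" decides which hyperparameters are active.
CONFIGS = [
    {"algo": "svm", "C": 1.0, "kernel": "rbf"},
    {"algo": "svm", "C": 10.0, "kernel": "rbf"},
    {"algo": "svm", "C": 10.0, "kernel": "linear"},
    {"algo": "tree", "max_depth": 3, "criterion": "gini"},
    {"algo": "knn", "k": 5},
]
# Made-up fitness values (higher is better), for illustration only.
FITNESS = [0.90, 0.88, 0.80, 0.70, 0.85]


def flat_distance(a, b):
    """Hamming distance over the union of keys.

    An inactive (missing) hyperparameter counts as a mismatch, so the distance
    between different algorithms grows with their number of hyperparameters.
    """
    keys = set(a) | set(b)
    return sum(a.get(k) != b.get(k) for k in keys)


def hierarchical_distance(a, b):
    """Structure-aware distance (an assumption of this sketch).

    Different algorithms are always at distance 1.0; the same algorithm is at
    0.5 times the fraction of its hyperparameters that differ.
    """
    if a["algo"] != b["algo"]:
        return 1.0
    params = [k for k in a if k != "algo"]
    return 0.5 * sum(a[k] != b[k] for k in params) / len(params)


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def fdc(configs, fitness, distance):
    """Fitness distance correlation with respect to the best sampled config."""
    best = configs[max(range(len(fitness)), key=fitness.__getitem__)]
    dists = [distance(c, best) for c in configs]
    return pearson(fitness, dists)


if __name__ == "__main__":
    print("FDC, flat representation:        ", fdc(CONFIGS, FITNESS, flat_distance))
    print("FDC, hierarchical representation:", fdc(CONFIGS, FITNESS, hierarchical_distance))
```

The sample and the fitness values are identical in both calls; only the distance function changes. Under the flat encoding, the tree and k-NN configurations sit at different distances from the best SVM simply because they have different numbers of hyperparameters, whereas the hierarchical distance treats every change of algorithm the same way, so the metric shifts with the chosen neighborhood definition.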


Notes

1. bit.ly/38F0o3U
2. https://archive.ics.uci.edu/ml/index.php
3. https://www.kaggle.com/datasets


Acknowledgements

This work was supported by FAPEMIG (through grant no. CEX-PPM-00098-17), MPMG (through the project Analytical Capabilities), CNPq (through grant no. 310833/2019-1), and CAPES.

Author information

Authors and Affiliations

Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
Matheus C. Teixeira & Gisele L. Pappa

Authors

Matheus C. Teixeira
Gisele L. Pappa

Corresponding author

Correspondence to Gisele L. Pappa.

Editor information

Editors and Affiliations

Pontificia Universidad Católica de Valparaíso, Valparaíso, Chile
Leslie Pérez Cáceres
Université libre de Bruxelles, Bruxelles, Belgium
Thomas Stützle

About this paper

Cite this paper

Teixeira, M.C., Pappa, G.L. (2023). On the Effect of Solution Representation and Neighborhood Definition in AutoML Fitness Landscapes. In: Pérez Cáceres, L., Stützle, T. (eds) Evolutionary Computation in Combinatorial Optimization. EvoCOP 2023. Lecture Notes in Computer Science, vol 13987. Springer, Cham. https://doi.org/10.1007/978-3-031-30035-6_15

DOI: https://doi.org/10.1007/978-3-031-30035-6_15
Published: 31 March 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-30034-9
Online ISBN: 978-3-031-30035-6
eBook Packages: Computer Science, Computer Science (R0)
