On the Effect of Solution Representation and Neighborhood Definition in AutoML Fitness Landscapes

  • Conference paper
Evolutionary Computation in Combinatorial Optimization (EvoCOP 2023)

Abstract

Interest in AutoML search spaces has given rise to a plethora of studies that aim to better understand the characteristics of these spaces. Exploratory landscape analysis is among the most commonly investigated techniques. However, in contrast with classical optimization problems, in AutoML defining the landscape may be as hard as characterizing it. This is because the concept of solution neighborhood is not clear: the spaces have a large number of conditional hyperparameters and a somewhat hierarchical structure. This paper looks at the impact of different solution representations and distance metrics on the definition of these spaces, and at how they affect exploratory landscape analysis metrics. We conclude that these metrics are not able to deal with structured, complex spaces such as the AutoML ones, and that problem-related metrics might be the way to leverage the landscape complexity.
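To make the neighborhood ambiguity concrete, the following is a minimal sketch under stated assumptions, not the representations or distance metrics used in the paper. It builds a hypothetical toy search space in which the choice of classifier activates different conditional hyperparameters, then compares two illustrative distances: a flat one (a fixed-length vector in which inactive hyperparameters are filled with assumed default values) and a tree-aware one (changing the classifier costs the root change plus every conditional hyperparameter that has to be replaced). All names, values, and defaults are invented for illustration.

```python
from typing import Callable, Dict, Tuple, Union

Value = Union[str, int, float]
Config = Dict[str, Value]

# Hypothetical toy space: "classifier" is the root choice; the other
# hyperparameters are conditional on it. Defaults are assumptions.
DEFAULTS: Config = {"n_neighbors": 5, "weights": "uniform", "kernel": "rbf", "C": 1.0}
FIELDS: Tuple[str, ...] = ("classifier",) + tuple(DEFAULTS)

CONFIGS: Dict[str, Config] = {
    "a": {"classifier": "knn", "n_neighbors": 5, "weights": "uniform"},
    "b": {"classifier": "knn", "n_neighbors": 7, "weights": "distance"},
    "c": {"classifier": "svm", "kernel": "rbf", "C": 1.0},
}


def to_flat(config: Config) -> Tuple[Value, ...]:
    """Fixed-length encoding; inactive hyperparameters take their default."""
    return tuple(config.get(name, DEFAULTS.get(name)) for name in FIELDS)


def flat_distance(x: Config, y: Config) -> int:
    """Hamming distance between the flat encodings."""
    return sum(u != v for u, v in zip(to_flat(x), to_flat(y)))


def tree_distance(x: Config, y: Config) -> int:
    """Tree-aware distance: a different root costs 1 plus every active
    conditional hyperparameter on both sides; otherwise count the active
    hyperparameters whose values differ."""
    if x["classifier"] != y["classifier"]:
        return 1 + (len(x) - 1) + (len(y) - 1)
    return sum(x[name] != y[name] for name in x if name != "classifier")


def nearest(target: str, distance: Callable[[Config, Config], int]) -> str:
    """Name of the configuration closest to `target` under `distance`."""
    others = [name for name in CONFIGS if name != target]
    return min(others, key=lambda name: distance(CONFIGS[target], CONFIGS[name]))


print("flat nearest to a:", nearest("a", flat_distance))
print("tree nearest to a:", nearest("a", tree_distance))
```

In this toy setting the two distances disagree about the nearest neighbor of configuration `a`: the flat encoding picks `c` (only the classifier field differs once defaults are filled in), while the tree-aware distance picks `b` (same classifier, two hyperparameters changed). Any neighborhood-based landscape metric computed on top of these distances would therefore see a different landscape for the same set of solutions.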

Notes

  1. bit.ly/38F0o3U.

  2. https://archive.ics.uci.edu/ml/index.php.

  3. https://www.kaggle.com/datasets.

Acknowledgements

This work was supported by FAPEMIG (through grant no. CEX-PPM-00098-17), MPMG (through the project Analytical Capabilities), CNPq (through grant no. 310833/2019-1), and CAPES.

Author information

Corresponding author

Correspondence to Gisele L. Pappa.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Teixeira, M.C., Pappa, G.L. (2023). On the Effect of Solution Representation and Neighborhood Definition in AutoML Fitness Landscapes. In: Pérez Cáceres, L., Stützle, T. (eds) Evolutionary Computation in Combinatorial Optimization. EvoCOP 2023. Lecture Notes in Computer Science, vol 13987. Springer, Cham. https://doi.org/10.1007/978-3-031-30035-6_15

  • DOI: https://doi.org/10.1007/978-3-031-30035-6_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-30034-9

  • Online ISBN: 978-3-031-30035-6

  • eBook Packages: Computer Science (R0)
