Abstract
In this chapter, we present a new implementation of the popular Tree-Based Pipeline Optimization Tool (TPOT). This new implementation, called TPOT2, was rebuilt from the ground up to be more modular, easier to maintain, and easier to expand. TPOT2 comes with new features and optimizations, such as a more flexible graph-based representation of Scikit-Learn pipelines and the ability to specify various aspects of the evolutionary run. Using experiments on multiple benchmark datasets, we show that TPOT2 performs at least as well as TPOT1 with equivalent settings, with stronger performance on a few datasets. We outline some future directions for further optimizations and applications.
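To make the idea of a graph-based pipeline representation concrete, the following is a minimal sketch and not TPOT2's actual data structure or API. It assumes a pipeline can be modelled as a directed acyclic graph in which each node is a processing step and each edge carries the output of one step to the next. The step functions `scale`, `square`, and `combine`, the `pipeline_graph` dictionary, and `run_pipeline` are all hypothetical names introduced for illustration; in practice each node would wrap a Scikit-Learn transformer or estimator rather than a plain function.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


# Hypothetical stand-ins for pipeline steps. Each step receives a list
# holding the outputs of its parent nodes.
def scale(inputs):
    (xs,) = inputs
    top = max(abs(x) for x in xs) or 1.0  # avoid dividing by zero
    return [x / top for x in xs]


def square(inputs):
    (xs,) = inputs
    return [x * x for x in xs]


def combine(inputs):
    # Element-wise average of the outputs of all parent nodes.
    return [sum(vals) / len(vals) for vals in zip(*inputs)]


# Each node maps to (step, parent nodes). "input" denotes the raw data.
# The output of "scale" feeds both "square" and "combine", so the graph
# is a DAG rather than a simple chain.
pipeline_graph = {
    "scale": (scale, ["input"]),
    "square": (square, ["scale"]),
    "combine": (combine, ["scale", "square"]),
}


def run_pipeline(graph, data, output_node):
    """Evaluate every node after its parents and return one node's output."""
    deps = {name: set(parents) for name, (_, parents) in graph.items()}
    results = {"input": data}
    for name in TopologicalSorter(deps).static_order():
        if name == "input":
            continue  # raw data is already available
        step, parents = graph[name]
        results[name] = step([results[p] for p in parents])
    return results[output_node]


result = run_pipeline(pipeline_graph, [1.0, -2.0, 4.0], "combine")
```

Evaluating nodes in topological order guarantees that every step runs only after all of its inputs are available, which is what allows one intermediate result to be shared by several later steps.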
Acknowledgements
The study was supported by the following NIH grants: R01 LM010098 and U01 AG066833.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this chapter
Ribeiro, P. et al. (2024). TPOT2: A New Graph-Based Implementation of the Tree-Based Pipeline Optimization Tool for Automated Machine Learning. In: Winkler, S., Trujillo, L., Ofria, C., Hu, T. (eds) Genetic Programming Theory and Practice XX. Genetic and Evolutionary Computation. Springer, Singapore. https://doi.org/10.1007/978-981-99-8413-8_1
DOI: https://doi.org/10.1007/978-981-99-8413-8_1
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8412-1
Online ISBN: 978-981-99-8413-8
eBook Packages: Computer Science, Computer Science (R0)