Dynamic Depth for Better Generalization in Continued Fraction Regression

ABSTRACT
A continued fraction expansion represents a real number as an expression obtained by iteratively extracting the integer part and inverting the fractional remainder.
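As a concrete sketch of that procedure (our illustration, not code from the paper), the following snippet recovers the leading partial quotients of a real number; the `limit_denominator` bound is an arbitrary choice to tame floating-point round-off:

```python
from fractions import Fraction

def continued_fraction(x, depth=8):
    """Return the first `depth` partial quotients [a0; a1, a2, ...] of x."""
    x = Fraction(x).limit_denominator(10 ** 12)  # tame float round-off
    terms = []
    for _ in range(depth):
        a = x.numerator // x.denominator  # extract the integer part
        terms.append(a)
        frac = x - a                      # fractional remainder
        if frac == 0:
            break                         # x was rational; expansion terminates
        x = 1 / frac                      # invert and repeat
    return terms

# The golden ratio has the all-ones expansion [1; 1, 1, 1, ...]
print(continued_fraction((1 + 5 ** 0.5) / 2, depth=6))
```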
Continued Fraction Regression (CFR) is a method for approximating unknown target functions from data. Its key idea is to represent the target function as an analytic continued fraction expansion, found by an optimization procedure that searches the space of candidate fractions for the one that best fits the given data. This research investigates the relationship between truncated fraction depth, accuracy, model complexity, and training time in the CFR method on challenging regression problems. Specifically, we consider low-sample synthetic datasets with additive Gaussian noise, which serve as a proxy for low-sample dynamical systems whose underlying models are obscured by measurement error.
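To make the representation concrete, here is a minimal sketch of evaluating a truncated analytic continued fraction whose terms are linear functions of the feature vector, i.e. f(x) = t0(x) + t1(x)/(t2(x) + t3(x)/(t4(x) + ...)). This is our illustrative code under assumed conventions (the names `eval_cfr`, `linear`, and the `(a, b)` coefficient encoding are not from the paper), not the authors' implementation:

```python
import numpy as np

def linear(term, x):
    """Evaluate one linear term a.x + b on feature vector x."""
    a, b = term
    return np.dot(a, x) + b

def eval_cfr(terms, x):
    """Evaluate f(x) = t0(x) + t1(x) / (t2(x) + t3(x) / (t4(x) + ...))
    for an odd-length list of linear (a, b) terms, outermost first."""
    v = linear(terms[-1], x)                      # innermost term
    for i in range(len(terms) - 3, -1, -2):       # fold outward, two terms at a time
        v = linear(terms[i], x) + linear(terms[i + 1], x) / v
    return v

# Depth-1 example encoding f(x) = 1 + x / (x + 1) for a single feature
terms = [(np.array([0.0]), 1.0),   # t0 = 1
         (np.array([1.0]), 0.0),   # t1 = x
         (np.array([1.0]), 1.0)]   # t2 = x + 1
print(eval_cfr(terms, np.array([2.0])))  # 1 + 2/3, approximately 1.6667
```

Truncating the fraction at a shallower or deeper level trades approximation power against model complexity, which is the trade-off the depth-regulating approaches below address.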
We propose three depth-regulating CFR approaches and assess their performance against six modern symbolic regression methods. The results reinforce the strong generalization capacity of the CFR method while reducing model complexity and execution time. On the 21 Nguyen datasets, our method earns the most 1st-place rankings among its competitors, never finishes worse than 3rd on any dataset, and takes at most 4% of the training time of its closest competitor.