Geometric Semantic Genetic Programming (GSGP) [1] has profoundly impacted the field of evolutionary computation since its introduction in 2012. The seminal paper by Moraglio, Krawiec, and Johnson, presented at PPSN XII, introduced an important development in genetic programming that provides a unimodal fitness landscape with constant slope for, in principle, any supervised machine learning task. This breakthrough was achieved by designing genetic operators with specific semantic guarantees: crossover produces offspring whose behavior is intermediate to their parents’, and mutation generates variants whose behavior is similar to that of the original program. The theoretical framework established a clear genotype–phenotype mapping between syntax and semantics, enabling simple syntactic implementations of these semantic operators. Since its inception, GSGP has inspired hundreds of research works that apply, improve, or extend this framework, demonstrating its significant impact on the field.
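The semantic guarantees above can be illustrated at the level of output vectors. In GSGP for regression, a program’s semantics is its vector of outputs on the training inputs; the sketch below (the vector-level simplification and function names are ours, not from the original paper) follows the standard definitions of geometric semantic crossover, r·s1 + (1−r)·s2 with r in [0, 1], and geometric semantic mutation, s + ms·(sr1 − sr2):

```python
def geometric_crossover(s1, s2, r):
    """Offspring semantics: r*s1 + (1-r)*s2 with r in [0, 1].
    Each output lies between the parents' outputs, i.e. behavior
    intermediate to the parents'."""
    return [r * a + (1 - r) * b for a, b in zip(s1, s2)]

def geometric_mutation(s, sr1, sr2, ms):
    """Mutant semantics: s + ms*(sr1 - sr2), where sr1 and sr2 are the
    semantics of two random programs. For a small mutation step ms, the
    mutant behaves similarly to the original program."""
    return [a + ms * (r1 - r2) for a, r1, r2 in zip(s, sr1, sr2)]

parent1 = [1.0, 2.0, 3.0]
parent2 = [3.0, 0.0, 1.0]
print(geometric_crossover(parent1, parent2, 0.5))
# componentwise between the parents: [2.0, 1.0, 2.0]
```

Because the offspring’s error on each training case is bounded by the parents’ errors in this way, the induced fitness landscape is unimodal regardless of the dataset.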

To commemorate the tenth anniversary of GSGP, we organized this special issue to present a representative sample of contemporary research threads in this area and foster future developments in the field. The papers collected in this issue showcase both the theoretical depth and practical applicability that GSGP has achieved over the past decade.

1 Advances in theory and implementation

The special issue opens with a novel theoretical perspective by Grant Dick, who reinterprets GSGP through the lens of ensemble learning. This fresh viewpoint frames mutation and crossover as boosting and stacking operations respectively, enabling the integration of regularization techniques to address the long-standing challenge of program growth. This theoretical contribution not only provides new insights into GSGP’s behavior but also suggests practical improvements through better initialization techniques.

Implementation efficiency, a critical aspect for real-world applications, is addressed by the groundbreaking GSGP-Hardware of Yazmin Maldonado, Ruben Salas, Joel A. Quevedo, Rogelio Valdez and Leonardo Trujillo. Their FPGA implementation, with a bespoke modular hardware realization of each step of the evolutionary loop (including the geometric search operators and the fitness function), achieves remarkable efficiency improvements of three to four orders of magnitude over GPU implementations when deployed on a commodity FPGA chip. This work demonstrates how GSGP’s unique properties can be leveraged on specialized hardware architectures, opening new possibilities for edge computing and IoT applications.

2 Novel operators and extensions

The collection features several innovative extensions to the basic GSGP framework. Hengzhe Zhang, Qi Chen, Bing Xue, Wolfgang Banzhaf and Mengjie Zhang introduce a geometric semantic macro-crossover operator for multi-tree genetic programming, significantly improving feature construction in regression tasks. Their comprehensive evaluation across 98 datasets demonstrates the practical value of considering interactions between multiple trees within an individual.

Illya Bakurov, José Manuel Muñoz Contreras, Mauro Castelli, Nuno Rodrigues, Sara Silva, Leonardo Trujillo and Leonardo Vanneschi present an elegant enhancement to the geometric semantic mutation operator, drawing inspiration from deep learning’s batch normalization. Their approach of standardizing random programs reduces program size while improving performance, showing how insights from other machine learning paradigms can advance GSGP.
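The core idea of standardizing a random program can be sketched at the semantic level: rescale the random program’s output vector to zero mean and unit variance before it enters the mutation expression. The following minimal sketch is our illustration of that idea in the spirit of batch normalization; the paper’s exact formulation may differ:

```python
from statistics import mean, pstdev

def standardize(semantics):
    """Zero-mean, unit-variance rescaling of a random program's output
    vector (illustrative sketch, not the authors' exact procedure).
    A zero spread falls back to dividing by 1 to avoid division by zero."""
    mu = mean(semantics)
    sigma = pstdev(semantics) or 1.0
    return [(v - mu) / sigma for v in semantics]

print(standardize([1.0, 2.0, 3.0]))
# symmetric around zero, e.g. [-1.22..., 0.0, 1.22...]
```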

Lorenzo Bonin, Luigi Rovito, Andrea De Lorenzo and Luca Manzoni address the challenge of premature convergence with a cellular GSGP variant that introduces spatial structure into population management. Rather than evolving in a single homogeneous population, candidate solutions are distributed across an n-dimensional toroidal grid, with the evolutionary process confined to local neighborhoods. The presented empirical results corroborate the authors’ hypothesis: the solutions found by the cellular variant outperform those of baseline GSGP. This novel approach demonstrates how classical concepts from cellular automata can be successfully adapted to enhance GSGP’s exploration capabilities.
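The toroidal grid structure can be made concrete with a small helper for a 2-D case: each individual occupies a cell, and selection draws only from the cell’s wrap-around neighborhood. This is an illustrative sketch of the standard cellular-EA mechanism; the paper’s exact neighborhood shape and grid dimensionality may differ:

```python
def von_neumann_neighbors(row, col, rows, cols):
    """Indices of the four orthogonal neighbors of cell (row, col) on a
    2-D toroidal grid. The modulo wrap-around makes the grid edgeless,
    so every cell has a full neighborhood."""
    return [((row - 1) % rows, col), ((row + 1) % rows, col),
            (row, (col - 1) % cols), (row, (col + 1) % cols)]

print(von_neumann_neighbors(0, 0, 4, 4))
# corner cell wraps to the opposite edges: [(3, 0), (1, 0), (0, 3), (0, 1)]
```

Restricting mating to such neighborhoods slows the spread of dominant individuals across the grid, which is what preserves diversity and counteracts premature convergence.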

3 Current state and future directions

The special issue concludes with a comprehensive benchmarking study by José Manuel Muñoz Contreras, Leonardo Trujillo, Daniel E. Hernández and Luis A. Cardenas Florido, which addresses a crucial question: Is GSGP still competitive after a decade? Their thorough empirical evaluation, using modern implementations and optimal parameter settings, confirms that GSGP remains a powerful approach for symbolic regression. The authors employ a parallel version of GSGP with optimal mutation step calculation, demonstrating that with relatively simple but strategic modifications to the original algorithm, GSGP continues to compete effectively with state-of-the-art methods on black-box problems. This work provides valuable validation for researchers continuing to explore and extend the GSGP framework, showing that the fundamental principles established a decade ago maintain their relevance in today’s machine learning landscape.

Looking ahead, several promising research directions emerge from these contributions:

  1. The development of specialized operators for specific problem domains.

  2. The exploration of hybrid approaches combining GSGP with other machine learning paradigms.

  3. The integration of ideas from deep learning and ensemble methods to enhance GSGP’s performance.

  4. The potential for hardware-specific implementations to enable real-time learning in resource-constrained environments.