Elsevier

Swarm and Evolutionary Computation

Volume 44, February 2019, Pages 840-851
Alignment-based genetic programming for real life applications

https://doi.org/10.1016/j.swevo.2018.09.006

Abstract

A recent discovery has attracted the attention of many researchers in the field of genetic programming: given individuals with particular characteristics of alignment in the error space, called optimally aligned, it is possible to reconstruct a globally optimal solution. Furthermore, recent preliminary experiments have shown that an indirect search consisting of looking for optimally aligned individuals can have benefits in terms of generalization ability compared to a direct search for optimal solutions. For this reason, defining genetic programming systems that look for optimally aligned individuals is becoming an ambitious and important objective. Nevertheless, the systems that have been introduced so far present important limitations that make them unusable in practice, particularly for complex real-life applications. In this paper, we overcome those limitations, and we present the first usable alignment-based genetic programming system, called nested alignment genetic programming (NAGP). The presented experimental results show that NAGP is able to outperform two of the most recognized state-of-the-art genetic programming systems on four complex real-life applications. The predictive models generated by NAGP are not only more effective than the ones produced by the other studied methods but also significantly smaller and thus more manageable and interpretable.

Introduction

The use of machine learning (ML) techniques to tackle challenging real-world problems has been a cornerstone of research in recent decades. Among the various ML techniques, evolutionary computation (EC) has shown its ability to address a wide range of complex problems across various domains [1]. The common idea shared by EC methods is to mimic the Darwinian principles of natural selection [2] to solve optimization problems. In particular, given an objective function that quantifies the quality of each solution, EC starts by randomly creating a set of candidate solutions and uses the objective function as an abstract fitness measure. Based on this fitness, some of the better candidates are chosen to seed the next generation, which is produced by applying stochastic genetic operators [1].
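The loop described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the system studied in this paper: the function name `evolve`, the size-2 tournament, the elitism step, and the toy objective are all assumptions introduced for the example.

```python
import random

def evolve(fitness, random_solution, mutate, pop_size=20, generations=50, seed=0):
    """Minimal generational loop (hypothetical helper): evaluate, select, vary."""
    rng = random.Random(seed)
    # Start from a randomly created set of candidate solutions.
    population = [random_solution(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Tournament of size 2: the fitter of two random candidates becomes a parent.
        parents = [min(rng.sample(population, 2), key=fitness) for _ in range(pop_size)]
        # A stochastic operator (here mutation only) produces the offspring.
        offspring = [mutate(p, rng) for p in parents]
        # Elitism (an assumption of this sketch): keep the best individual found so far.
        best_so_far = min(population, key=fitness)
        population = offspring[:-1] + [best_so_far]
    return min(population, key=fitness)

# Toy problem (hypothetical): minimise (x - 3)^2 over the real numbers.
best = evolve(
    fitness=lambda x: (x - 3.0) ** 2,
    random_solution=lambda rng: rng.uniform(-10.0, 10.0),
    mutate=lambda x, rng: x + rng.gauss(0.0, 0.5),
)
print(best)
```

Lower fitness is treated as better here, which matches the error-minimisation setting used later in the paper.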

One of the newest techniques belonging to the EC family is genetic programming (GP) [3], where solutions are (typically) represented as Lisp-like trees. GP has produced human-competitive results in various domains [4], and one of its advantages is that its application requires only limited knowledge of the problem to be solved [3]. Despite its ability to produce good-quality solutions, GP suffers from some important limitations. The first relates to the time needed to evaluate each solution in the population, a time-consuming process that can be exacerbated by the onset of bloat, that is, the increase of tree size (i.e., the number of nodes) without a corresponding improvement in fitness. The second concerns the standard genetic operators used by GP: mutation and crossover. These two operators work by performing blind transformations of the syntax (i.e., the structure) of the solutions to build new individuals. While this allows for a simple definition of the genetic operators for various domains, the standard GP operators completely ignore the information about the behavior (i.e., the semantics) of the solutions. That is, given a parent solution (or two parents for crossover), it is very difficult to predict the output vector that the new child (or two children for crossover) will produce after the application of the genetic operators. Yet semantics is what really matters to a GP practitioner, since it is the information needed to evaluate the suitability of a solution for a given problem. For this reason, the integration and use of semantic awareness in GP has become one of the hottest topics in the field of EC [5].

Semantics is defined as the output vector produced by a GP individual when evaluated over a set of training cases [5]. Among the many approaches that have been defined so far, geometric semantic GP (GSGP) [6] and one of its more recent developments, GSGP with local search (GSGP_LS) [7], have become particularly popular in the last few years. This success is probably due to the ability of these two systems to induce a unimodal error surface (i.e., with no local optima) for any supervised learning problem. Additionally, GSGP allows direct inclusion of semantic awareness in the evolutionary search process. By contrast, the large majority of the existing semantic methods are indirect, in the sense that they use traditional syntax-based genetic operators and only accept newly created solutions based on some semantic criteria [5]. This direct use of semantics makes GSGP suitable for addressing problems characterized by a vast amount of data while still maintaining an acceptable running time [8,9].
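To make the definition of semantics concrete, the following minimal Python sketch evaluates programs over a set of training cases. The training inputs, the three example programs, and the helper name `semantics` are hypothetical and serve only as an illustration.

```python
# Hypothetical training cases: four input values for a single-variable problem.
inputs = [0.0, 1.0, 2.0, 3.0]

def semantics(program, cases):
    """Semantics of a program: the vector of its outputs over the training cases."""
    return [program(x) for x in cases]

# Two programs with different syntax but identical behaviour on these cases,
# and a third program with different behaviour.
p1 = lambda x: x * (x + 1.0)
p2 = lambda x: x * x + x
p3 = lambda x: 2.0 * x

print(semantics(p1, inputs))                           # [0.0, 2.0, 6.0, 12.0]
print(semantics(p1, inputs) == semantics(p2, inputs))  # True
print(semantics(p3, inputs))                           # [0.0, 2.0, 4.0, 6.0]
```

The example shows why syntax alone is a poor guide to behaviour: `p1` and `p2` differ as trees but occupy the same point in the semantic space.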

To continue this promising research stream, in this paper, we introduce a novel GP system, aimed at exploiting semantic awareness in a totally different way, compared to GSGP and GSGP_LS. The new system, called nested alignment GP (NAGP), is based on the recently defined concept of alignment in the error space, which is discussed in Section 2. The contribution of this work is twofold:

  • In the first place, we further develop a very recent and promising idea to exploit semantic awareness in GP (i.e., the alignment in the error space [10]), which has received relatively little attention from the GP community so far;

  • Secondly, we define a novel computational intelligence method, NAGP, which is able to generate predictive models that are more effective and manageable than the ones generated by GSGP and GSGP_LS for a set of complex real-life applications.

The former contribution is important for the GP community because it represents a further step forward in a very popular research line (improving GP with semantic awareness). On the other hand, considering that GSGP and GSGP_LS are regarded as state-of-the-art computational technology for generating predictive models in all the applications that we study in this paper, the latter contribution promises to have a strong impact on these very important applicative domains. As we will see in the rest of this paper, NAGP has two competitive advantages compared to GSGP and GSGP_LS: not only is NAGP able to obtain more accurate predictive models, but these models are also smaller in size, which makes them more readable and interpretable.

The paper is organized as follows: In Section 2, we introduce the idea of alignment in the error space, also discussing some previous preliminary studies in which this idea was developed. In Section 3, we discuss the studied applications. In Section 4, we present NAGP for the first time, as well as a variant of NAGP called NAGP_β, describing every single step of their implementation. Section 5 describes the employed experimental settings. In Section 6, we present and discuss the obtained experimental results, comparing the performance of NAGP and NAGP_β to that of GSGP and GSGP_LS for the prediction of the energy consumption of buildings. Finally, Section 7 concludes the paper and proposes suggestions for future research.

Section snippets

Previous work on alignment in the error space

A few years after the introduction of GSGP, a new way of exploiting semantic awareness was presented in Ref. [10] and further developed in Refs. [11,12]. The idea, which is also the focus of this paper, is based on the concept of error space, which is illustrated in Fig. 1. In the genotypic space, programs are represented by their syntactic structures (for instance, trees, as in Ref. [3], or any other of the existing representations). As explained above, semantics can be represented as a point
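As an illustration of why aligned individuals are useful, consider the following Python sketch. It assumes that two individuals are "optimally aligned" when their error vectors (semantics minus target) are scalar multiples of each other, e_a = k · e_b with k ≠ 1. That reading is an assumption of this example, not a definition taken from the excerpt above. Under it, simple algebra gives s_a − t = k (s_b − t), hence t = (s_a − k · s_b) / (1 − k). The function name and the numeric values are hypothetical.

```python
def reconstruct_target(sem_a, sem_b, k):
    """Recover the target vector from two semantics whose errors satisfy e_a = k * e_b.

    From s_a - t = k * (s_b - t) it follows that t = (s_a - k * s_b) / (1 - k).
    """
    if k == 1:
        raise ValueError("k = 1 gives identical error vectors; the target is not recoverable")
    return [(a - k * b) / (1.0 - k) for a, b in zip(sem_a, sem_b)]

# Hypothetical data: a target vector and an error vector for individual b.
target = [1.0, 4.0, 9.0]
error_b = [0.5, -1.0, 2.0]
k = 3.0

# Build two semantics whose error vectors are aligned by construction.
sem_b = [t + e for t, e in zip(target, error_b)]
sem_a = [t + k * e for t, e in zip(target, error_b)]

recovered = reconstruct_target(sem_a, sem_b, k)
print(recovered)
```

Neither individual needs to be accurate on its own: only the alignment of the two error vectors matters for the reconstruction in this sketch.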

Test problems and data sets

To test the studied GP frameworks, we used four benchmark problems. The first dataset was already used in Ref. [15] and in Ref. [16]. In particular, the dataset consists of 8 independent variables (or features) and 768 instances. Each instance is related to a particular building, and the objective is to predict the energy consumption of the heating system of a particular building. An explanation of the features is reported in Table 1. For additional details about the dataset, the reader is

Nested alignment genetic programming

NAGP uses multi-individuals and thus extends the first attempt proposed in Ref. [11]. In this section, we describe the selection, mutation, and population initialization of NAGP, keeping in mind that no crossover has been defined yet for this method. Furthermore, we explain how NAGP overcomes the problems described in Section 2. In the last part of this section, we also define a variant of the NAGP method, called NAGP_β, which will also be taken into account in our experimental study.

Selection

Experimental settings

Performance of the evaluated systems was assessed by considering a k-fold cross-validation, in order to avoid any bias with respect to the splitting of the data. In particular, a repeated 10-fold cross-validation (for a total of 30 runs) was executed to ensure statistical robustness of the results. The fitness function is the root mean square error (RMSE) between target and obtained values. The parameters used are summarized in Table 3. These values were obtained after a preliminary tuning
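The evaluation protocol can be sketched as follows. This is a minimal illustration, not the authors' experimental code: the use of three repetitions is an inference from "10-fold" and "30 runs", the 768 instances correspond to the first dataset described earlier, and the helper names `rmse` and `repeated_kfold` are hypothetical.

```python
import math
import random

def rmse(predicted, target):
    """Root mean square error between predicted and target values."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target))

def repeated_kfold(n_instances, k=10, repeats=3, seed=0):
    """Yield (train_indices, test_indices) for every fold of every repetition."""
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_instances))
        rng.shuffle(indices)  # a fresh random partition for each repetition
        for fold in range(k):
            test = indices[fold::k]  # every k-th shuffled index forms one fold
            test_set = set(test)
            train = [i for i in indices if i not in test_set]
            yield train, test

splits = list(repeated_kfold(768))
print(len(splits))  # 30 runs: 3 repetitions x 10 folds
print(rmse([1.0, 2.0], [1.0, 4.0]))
```

Each run would train a model on `train` and report `rmse` on `test`; the model-fitting step is omitted because it depends on the GP system being evaluated.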

Experimental results

In this section, we present the results of the experimental campaign. In particular, results are discussed by first comparing the performance of the different GP-based systems and, subsequently, by comparing the performance of NAGP and NAGP_50 against that produced by various machine learning techniques that are commonly used to address regression problems.

Fig. 5 reports the training and test fitness achieved by the various GP systems on the four benchmarks taken into account. In particular,

Conclusions and future work

Previous work has shown the success of two sophisticated GP systems, able to exploit semantic awareness, for generating predictive models in many applicative fields. These two systems, GSGP and GSGP_LS, have clearly outperformed many of the existing methods, thus establishing themselves as the state-of-the-art technology for the generation of predictive models in those applicative areas. Nevertheless, GSGP and GSGP_LS have the important drawback of generating predictive models that, although very

Acknowledgments

This work was funded by CONACYT (Mexico) project no. FC-2015-2/944 “Aprendizaje evolutivo a gran escala” and TecNM (Mexico) project no. 6826-18-p.

References (28)

  • L. Vanneschi et al. A survey of semantic methods in genetic programming. Genet. Program. Evolvable Mach. (2014)
  • A. Moraglio et al. Geometric semantic genetic programming
  • M. Castelli et al. Geometric semantic genetic programming with local search
  • S. Ruberto et al. Esagp – a semantic GP framework based on alignment in the error space
