Nonlinear process modelling using echo state networks optimised by covariance matrix adaption evolutionary strategy

https://doi.org/10.1016/j.compchemeng.2020.106730

Abstract

Echo state networks (ESNs) have been shown to be an effective alternative to conventional recurrent neural networks (RNNs) due to their fast training process and good performance in dynamic system modelling. However, the performance of an ESN can be affected by its randomly generated reservoir. This paper presents nonlinear process modelling using an ESN optimized by the covariance matrix adaptation evolution strategy (CMA-ES). CMA-ES is used to optimize the structural parameters of the ESN: reservoir size, spectral radius, and leak rate. The proposed method is applied to three case studies: modelling a time series, modelling a conic tank, and modelling a fed-batch penicillin fermentation process. The results are compared with those from the original ESN, a long short-term memory network, GA-ESN (an ESN optimized by a genetic algorithm), and feedforward neural networks. The proposed method is shown to give much better performance than the original ESN and the other networks in all three case studies.

Introduction

Recurrent neural networks (RNNs) are very useful for solving complicated temporal machine learning and engineering tasks. RNNs contain cycles and loops of synaptic connections, which more closely resemble the synaptic structure of the human brain. In contrast to feedforward neural networks (FNNs), the outputs of some neurons in an RNN are fed back to their own inputs or to neurons in previous layers. RNNs therefore exhibit dynamic temporal behaviour and can process arbitrary sequences of inputs through their internal memory. Nonetheless, RNNs still have some limitations in real-world applications, including the possibility of bifurcation, no guarantee of global convergence, and, more importantly, a slow training process due to the use of gradient descent methods (Antonelo et al., 2017). A new type of RNN, the echo state network (ESN), was proposed in (Jaeger and Haas, 2004). The training of ESNs does not require backpropagation through time (BPTT) and can be done very quickly. A similar type of RNN, the liquid state machine (LSM), was reported in (Maass et al., 2002). In addition, a new learning approach for recurrent neural networks, the backpropagation-decorrelation rule, was proposed in (Steil, 2006). These three approaches are now often referred to as reservoir computing (RC). In RC, a network generally consists of two main parts: the reservoir, a high-dimensional recurrent pool of neurons with randomly generated and fixed synaptic weights, and a linear adaptive readout layer, which maps the internal reservoir states to the network output. These types of RNNs have significant advantages, such as a rapid training procedure, guaranteed optimality in the least squares sense, and excellent results. RC applications have been reported in machine learning and engineering fields such as dynamic pattern classification (Jaeger et al., 2007a), speech recognition of isolated digits (Verstraeten et al., 2005), and the computation of highly nonlinear functions on the instantaneous rates of spike trains (Maass et al., 2004).

Even though reservoir computing has been successfully applied to real-world tasks, tuning the reservoir parameters remains difficult. ESNs have several limitations in practical applications: the reservoir properties are hard to interpret, and the initialization of reservoir parameters, for instance the size and spectral radius of the reservoir, is usually done by experience. In addition, as the internal reservoir connections and the input weights are generated randomly, reservoirs are often determined by trial and error, so reservoir optimization is generally a challenge. In recent years, improvements to the original ESN have been made in three main areas: the development of new neuron types, the improvement of reservoir topology, and the use of optimization methods to optimize key structural parameters of the ESN. For the first aspect, spiking neurons (Verstraeten et al., 2007) can be used to increase prediction accuracy in time-series modelling and pattern classification (Jaeger et al., 2007a). For the second, enriching the reservoir topology can tailor the ESN to specific tasks, and many efficient reservoir construction schemes have been proposed, such as small-world topology (Deng and Zhang, 2007), scale-free topology (Cui et al., 2012), and the simple cycle reservoir (Rodan and Tino, 2011), all of which have been shown to outperform the original ESN.

As optimizing reservoirs is generally challenging while checking the performance of a resulting ESN is relatively inexpensive, evolutionary optimization techniques such as evolution strategies (ES) and genetic algorithms (GA) are a natural choice for this optimization task (Lukoševičius and Jaeger, 2009). Evolution strategies are useful global search methods because they do not suffer from the limitations of traditional methods, and can thus be used to build a good ESN model. Optimization of ESNs using evolutionary strategies can generally be carried out through several different approaches: optimization of the reservoir topology, optimization of the connection weights, and optimization of the reservoir parameters. If the optimization method operates on the connection weights directly, the search space becomes quite large; moreover, the variance in performance across different reservoirs with the same spectral radius remains substantial, which is clearly undesirable (Schrauwen et al., 2008). In earlier works on ESN reservoir optimization, several evolutionary algorithms have been presented, including GA (Ferreira and Ludermir, 2011), differential evolution (DE) (Otte et al., 2016), particle swarm optimization (PSO) (Chouikhi et al., 2015), and Evolino (Schmidhuber et al., 2007). Additionally, other metaheuristic methods have been used to optimize the reservoir global parameters and topology (Ferreira and Ludermir, 2010; Ferreira et al., 2013).

In this paper, a new method for optimizing ESNs using the covariance matrix adaptation evolution strategy (CMA-ES) is proposed. CMA-ES is an efficient and widely used metaheuristic approach for searching for optimal regions of complex spaces, with the advantage of being invariant to order-preserving transformations of the fitness function value and, in particular, to rotation and translation of the search space (Hansen, 2006). Three global reservoir parameters, the reservoir size, the spectral radius factor, and the leak rate, are optimized by CMA-ES in this paper. The reservoir size is the number of neurons in the reservoir layer. The spectral radius is the key factor for maintaining the echo state property (ESP). The leak rate determines the time scales and the state updating rate in the reservoir.
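To make the optimization loop concrete, the sketch below shows how the three reservoir parameters could be tuned with the `cma` Python package. The `train_and_validate_esn` helper is hypothetical, and the parameter ranges and the encoding of the search in the unit cube are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch: CMA-ES tuning of ESN structural parameters.
# Assumes the `cma` package (pip install cma); train_and_validate_esn
# is a hypothetical helper returning a validation RMSE.
import cma
import numpy as np

def train_and_validate_esn(reservoir_size, spectral_radius, leak_rate):
    """Hypothetical: build an ESN with the given parameters, train the
    linear readout by least squares, and return the validation RMSE."""
    raise NotImplementedError

# Search in the unit cube and map to assumed physical ranges, so one
# step size suits all three parameters.
def decode(z):
    z = np.clip(z, 0.0, 1.0)
    size = int(round(10 + z[0] * (1000 - 10)))   # reservoir size in [10, 1000]
    rho = 0.1 + z[1] * (1.5 - 0.1)               # spectral radius factor
    alpha = 0.01 + z[2] * (1.0 - 0.01)           # leak rate
    return size, rho, alpha

def fitness(z):
    return train_and_validate_esn(*decode(z))    # lower RMSE is better

es = cma.CMAEvolutionStrategy(3 * [0.5], 0.25)   # start at the cube centre
while not es.stop():
    candidates = es.ask()                                  # sample population
    es.tell(candidates, [fitness(c) for c in candidates])  # rank by fitness
best_size, best_rho, best_alpha = decode(es.result.xbest)
```

Each fitness evaluation trains one ESN, which is cheap because only the linear readout is fitted; this is the property that makes evolutionary search over reservoir parameters practical, as noted above.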

The paper is organized as follows. Section 2 introduces ESN and CMA-ES and presents the proposed training algorithm. Three case studies are presented in Section 3. Section 4 presents the modelling results. Finally, Section 5 gives some conclusions.


Echo state networks

Echo state networks are a type of recurrent neural network which differ from classical RNNs in that stability can be maintained provided a condition called the 'echo state property' is satisfied. An ESN is composed of a reservoir and a linear output layer which maps the reservoir states to the network output. Fig. 1 shows the original ESN, where all input neurons are connected to the reservoir neurons (the hidden layer), which are fully connected to the output layer, while the output can in turn be fed back to the reservoir through feedback weights.
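For reference, a standard leaky-integrator ESN update can be written as follows. This is the textbook formulation rather than a quotation from this paper, with W^in, W, W^back, and W^out denoting the input, internal, feedback, and output weight matrices and α the leak rate (matching the Win and Wback notation used in the Results section):

```latex
% Leaky-integrator ESN: state update and linear readout.
% x(n): reservoir state, u(n): input, y(n): output, \alpha: leak rate.
\begin{align}
  \tilde{x}(n+1) &= \tanh\!\left( W^{\mathrm{in}}\, u(n+1) + W\, x(n)
                    + W^{\mathrm{back}}\, y(n) \right) \\
  x(n+1)         &= (1-\alpha)\, x(n) + \alpha\, \tilde{x}(n+1) \\
  y(n+1)         &= W^{\mathrm{out}}\, x(n+1)
\end{align}
```

Only W^out is trained, by linear least squares on the collected reservoir states, which is what makes ESN training fast and optimal in the least squares sense.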

Application examples

In this section, three modelling case studies are used to test the performance of CMA-ES-ESN: Mackey-Glass time series prediction, modelling of a water tank, and modelling of a benchmark industrial fed-batch fermentation process. To evaluate the effectiveness of CMA-ES-ESN, the results are compared with those from the original ESN, GA-ESN, long short-term memory networks, and feedforward neural networks. In the original ESN, the reservoir size, leak rate, and spectral radius are set manually by trial and error.
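As a concrete illustration of the first benchmark, the sketch below generates a Mackey-Glass series by Euler integration of the delay differential equation with its commonly used parameters (τ = 17, β = 0.2, γ = 0.1, n = 10). The step size, initial history, and washout length are assumptions; the paper's exact data-generation settings are not reproduced here.

```python
import numpy as np

def mackey_glass(n_samples=10000, tau=17, beta=0.2, gamma=0.1, n=10,
                 dt=1.0, x0=1.2, washout=500):
    """Euler integration of dx/dt = beta*x(t-tau)/(1 + x(t-tau)**n) - gamma*x(t)."""
    delay = int(round(tau / dt))
    x = np.empty(n_samples + washout)
    x[:delay + 1] = x0                     # constant initial history
    for t in range(delay, n_samples + washout - 1):
        x_tau = x[t - delay]
        x[t + 1] = x[t] + dt * (beta * x_tau / (1.0 + x_tau**n) - gamma * x[t])
    return x[washout:]                     # discard the initial transient

series = mackey_glass()
```

With τ = 17 the series is chaotic, which is why it is a standard one-step-ahead prediction benchmark for ESNs.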

Results

The following parameters are used:

  • ESN: Every reservoir node has the activation function $\tanh(x)=\frac{\sinh(x)}{\cosh(x)}=\frac{e^{x}-e^{-x}}{e^{x}+e^{-x}}$; the input and feedback weights, Win and Wback, are generated as random numbers uniformly distributed in the range [−0.5, 0.5], and similarly the internal weights W are created from a uniform distribution in the range [−0.5, 0.5] with varying sparse density and reservoir size. In the standard ESN, the three structural parameters, reservoir size, leak rate, and spectral radius, are set by trial and error.
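The following sketch shows one common way to realize this initialization in NumPy, including the usual rescaling of the internal weights to a target spectral radius; the sparsity mask, default values, and function name are illustrative assumptions.

```python
import numpy as np

def init_reservoir(n_inputs, n_reservoir, n_outputs, spectral_radius=0.9,
                   density=0.1, rng=None):
    """Random ESN weights: uniform in [-0.5, 0.5]; internal weights sparse
    and rescaled so their largest eigenvalue magnitude equals spectral_radius."""
    rng = np.random.default_rng(rng)
    w_in = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_inputs))
    w_back = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_outputs))
    w = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_reservoir))
    w *= rng.random((n_reservoir, n_reservoir)) < density  # sparse mask
    rho = np.max(np.abs(np.linalg.eigvals(w)))             # current spectral radius
    w *= spectral_radius / rho                             # rescale to target
    return w_in, w, w_back

w_in, w, w_back = init_reservoir(n_inputs=1, n_reservoir=200, n_outputs=1, rng=0)
```

Rescaling by the largest absolute eigenvalue is the usual way to set the spectral radius, which CMA-ES then tunes through the spectral radius factor.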

Conclusions

Modelling nonlinear dynamic processes using echo state networks optimized by the covariance matrix adaptation evolution strategy is proposed in this paper. ESNs have been shown to be rapid, efficient, and accurate dynamic system modelling tools, but their performance can be influenced by the settings of the structural parameters, i.e. reservoir size, leak rate, and spectral radius factor. By using CMA-ES to optimize these structural parameters, much improved modelling performance can be achieved. The effectiveness of the proposed method is demonstrated on the three case studies.

Declaration of Competing Interest

The authors declare no conflict of interest.

Cited by (18)

  • Open benchmarks for assessment of process monitoring and fault diagnosis techniques: A review and critical analysis

    2022, Computers and Chemical Engineering
    Citation Excerpt :

    In the context of fault detection and diagnosis, IndPenSim was used as a benchmark to perform many investigations: a sparse parallel factor decomposition technique was used to model the sparse three-way data resulting from batch data (Luo et al., 2017); an extension of PCA, in combination with a fuzzy clustering technique, was proposed to perform phase identification and data segmentation in batch processes (Tanatavikorn and Yamashita, 2017); a methodology was proposed to integrate knowledge data to sparse models using correlations extracted from process flow diagrams (Luo and Bao, 2018); a sparsity strategy was applied to parallel factor decomposition in order to improve the interpretability of the model (Luo et al., 2019); a method based on Gaussian sampling was proposed to take into account the correlation between variables and autocorrelations (Alshraideh et al., 2021); a time alignment method was used to synchronize uneven batches based on probability distributions (Lee et al., 2021); and an unsupervised deep learning approach based on fault detection rate maximization was proposed (Agarwal et al., 2022). It is worth mentioning that most applications based on IndPenSim were related to empirical process modeling but not directly related to the development and implementation of process monitoring and fault diagnosis (Liu and Zhang, 2020). FCC Fractionator is a recently proposed model for simulating a refining operation comprising a fluidized bed catalytic cracker and a fractionator (Santander et al., 2022).

  • Echo State network based soft sensor for Monitoring and Fault Detection of Industrial Processes

    2021, Computers and Chemical Engineering
    Citation Excerpt :

    Additionally, tracking of internal state values also provides indications about the need to retrain the model, which can be very important since complex real industrial processes tend to evolve naturally to other operational conditions, without necessarily indicating a fault behavior, either due to aging of equipment or changes in the quality of raw materials, among many other possible reasons (Kruger & Xie, 2012). The proposed monitoring system is initially tested with synthetic time series datasets generated by the Mackey-Glass Equation (Mackey & Glass, 1977), which is a widely used benchmark in the field of nonlinear dynamic analysis, more specifically in time series forecasting, and is especially used in the context of ESNs (Jaeger & Haas, 2004; Wang & Yan, 2015; Løkse et al., 2017; Liu & Zhang, 2020). In addition, ESNs are used here for the first time to detect faults in real assets of the oil and gas industry, monitoring the operation conditions of equipment installed in a plant of Petrobras (Petróleo Brasileiro S.A.).

  • Developing accurate data-driven soft-sensors through integrating dynamic kernel slow feature analysis with neural networks

    2021, Journal of Process Control
    Citation Excerpt :

    Neural networks have always been one of the most used techniques for soft-sensing [8,9] ever since they were first used for inferential estimation by Willis et al. [7]. In more recent times, neural networks have evolved and extensions such as deep learning [10,11], ensembles [12–14] and echo state networks [15–17] have been widely used for soft-sensing applications. Other machine learning algorithms that are commonly used include support vector machines [18] and extreme learning machine [19,20].

  • Lights and shadows in Evolutionary Deep Learning: Taxonomy, critical methodological analysis, cases of study, learned lessons, recommendations and challenges

    2021, Information Fusion
    Citation Excerpt :

    The parameters of the rest of the recurrent neurons (the reservoir) are randomly initialized subject to some stability constraints, and kept fixed while the readout layer is trained [302]. Some works have been reported in the last couple of years dealing with the optimization of Reservoir Computing models, such as the composition of the reservoir, connectivity and hierarchical structure of Echo State Networks via Genetic Algorithms [303], or the structural hyper-parameter optimization of Liquid State Machines [304,305] and Echo State Networks [306] using an adapted version of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) solver. The relatively recent advent of Deep versions of Reservoir Computing models [307] unfolds an interesting research playground over which to propose new bio-inspired solvers for topology and hyper-parameter optimization.

  • A Neuroevolutionary Approach for System Identification

    2024, Journal of Control, Automation and Electrical Systems