
Neural Networks

Volume 24, Issue 5, June 2011, Pages 440-456

Architectural and Markovian factors of echo state networks

https://doi.org/10.1016/j.neunet.2011.02.002

Abstract

Echo State Networks (ESNs) constitute an emerging approach for efficiently modeling Recurrent Neural Networks (RNNs). In this paper we investigate some of the main aspects that can account for the success and limitations of this class of models. In particular, we propose complementary classes of factors related to the contractivity and architecture of reservoirs, and we study their relative relevance.

First, we show the existence of a class of tasks for which ESN performance is independent of the architectural design. The effect of the Markovian factor, which characterizes a significant class within these cases, is shown by introducing instances of easy/hard tasks for ESNs with contractive reservoir dynamics.

In the complementary cases, for which architectural design is effective, we investigate and decompose the aspects of network design that allow a larger reservoir to progressively improve the predictive performance. In particular, we introduce four key architectural factors: input variability, multiple time-scales dynamics, non-linear interactions among units and regression in an augmented feature space. To investigate the quantitative effects of the different architectural factors within this class of tasks successfully approached by ESNs, variants of the basic ESN model are proposed and tested on instances of datasets of different nature and difficulty.

Experimental evidence confirms the role of the Markovian factor and shows that all the identified key architectural factors play a major role in determining ESN performance.

Introduction

Recurrent Neural Networks (RNNs) are a widely known class of neural network models used for sequential data processing. Reservoir Computing (RC) (e.g. Lukoševičius and Jaeger, 2009, Verstraeten et al., 2007) is a denomination for a class of RNN models that are characterized by a conceptual separation between a recurrent dynamical part and a simple non-recurrent output tool. The striking feature of RC is that the recurrent part of the network can be left untrained after initialization as long as it satisfies some very easy-to-check properties. Learning is then restricted to the recurrent-free output part, leading to a very efficient RNN design. RC comprises several classes of RNN models, including the popular Echo State Networks (ESNs) (Jaeger, 2001, Jaeger and Haas, 2004), Liquid State Machines (LSMs) (Maass, Natschlager, & Markram, 2002) and other approaches such as BackPropagation Decorrelation (BPDC) (Steil, 2004, Steil, 2006) and Evolino (Schmidhuber, Wierstra, Gagliolo, & Gomez, 2007). In this paper we focus on the ESN approach.

An ESN typically consists of a large, sparsely connected, untrained reservoir layer of recurrent neurons, connected to a simple trained readout layer of linear neurons. A valid reservoir satisfies a condition on the state dynamics called the Echo State Property (ESP).
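As a concrete illustration of this separation between a fixed reservoir and a trained readout, the following is a minimal sketch, not the authors' implementation: the sizes, weight scalings and the one-step-ahead sine prediction task are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes, not taken from the paper.
N_u, N_x = 1, 50                     # input and reservoir dimensions
W_in = rng.uniform(-0.1, 0.1, (N_x, N_u))
W = rng.uniform(-1.0, 1.0, (N_x, N_x))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))  # rescale spectral radius to 0.9

def reservoir_states(inputs):
    """Drive the fixed, untrained reservoir with an input sequence."""
    x = np.zeros(N_x)
    states = []
    for u in inputs:
        x = np.tanh(W_in @ np.atleast_1d(u) + W @ x)
        states.append(x)
    return np.array(states)

# Only the linear readout is trained (here via ridge regression),
# illustrated on one-step-ahead prediction of a sine wave.
u_seq = np.sin(0.2 * np.arange(300))
X = reservoir_states(u_seq[:-1])
y = u_seq[1:]
W_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(N_x), X.T @ y)
pred = X @ W_out
```

Only `W_out` is learned; `W_in` and `W` keep their random initialization throughout, which is what makes the training so efficient.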

ESNs have been successfully applied in several sequential domains, such as non-linear system identification (e.g. Jaeger, 2002a), robot control (e.g. Hertzberg et al., 2002, Ishu et al., 2004, Plöger et al., 2003), speech processing (e.g. Skowronski & Harris, 2006), time series prediction and noise modeling (e.g. Jaeger & Haas, 2004).

However, some doubts remain about the success of ESNs on practical tasks, with particular regard to problems for which standard RNNs have achieved good performance (Prokhorov, 2005). Moreover, a number of more theoretical open issues remain and motivate the research effort in the ESN area. Some of the main research topics on ESNs (Jaeger, 2005) focus on the optimization of reservoirs towards specific problems (Ishu et al., 2004, Schmidhuber et al., 2007, Schrauwen et al., 2008), the role of the topological organization of reservoirs (Yanbo, Le, & Haykin, 2007) and the properties of reservoirs that are responsible for successful or unsuccessful applications (Hajnal and Lorincz, 2006, Ozturk et al., 2007). In particular, this last topic, considered in relation to the reservoir architecture and its (usually) high dimensionality, is of special interest for the aims of this paper.

Other aspects concerning the optimal design of ESNs, involving the setting of hyper-parameters of the reservoir such as the input scaling, the bias, the spectral radius and the settling time (see e.g. Venayagamoorthy and Shishir, 2009, Verstraeten et al., 2010), lie outside the aims of this paper.

An important feature of ESNs is the contractivity of the reservoir state transition function, which always guarantees stability of the network state dynamics (regardless of other initialization aspects) and the ESP (and therefore valid reservoirs). Moreover, under a contractive setting, the network state dynamics are bounded within a region of the state space with interesting properties. The characteristics of contractive state mappings have already been investigated in the contexts of Iterated Function Systems (IFSs), variable memory length predictive models, fractal theory, and the bias of trainable RNNs initialized with small weights (Hammer and Tiňo, 2003, Tiňo et al., 2004, Tiňo and Hammer, 2003, Tiňo et al., 2007). It is a known fact that RNNs initialized with contractive state transition functions are able to discriminate among different (recent) input histories even prior to learning (Hammer and Tiňo, 2003, Tiňo et al., 2004), according to a Markovian organization of the state dynamics. Such a characterization also applies to ESNs (e.g. Tiňo et al., 2007), although in this context it has not yet been completely clarified, and investigations into the possibilities and limitations of the ESN approach due to the Markovian nature of state dynamics are needed.
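The stability guaranteed by contractivity can be observed numerically. The sketch below uses illustrative sizes and scalings rather than the paper's specific setting; bounding the spectral norm `||W||_2` below 1 is a standard sufficient condition for contractivity with tanh units. After a contractive reservoir processes a long enough input sequence, its state no longer depends on the initial state.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 30
W = rng.uniform(-1.0, 1.0, (N, N))
# ||W||_2 < 1 is a sufficient condition for contractivity with tanh units.
W *= 0.8 / np.linalg.svd(W, compute_uv=False)[0]
W_in = rng.uniform(-0.5, 0.5, (N, 1))

def run(x, inputs):
    """Iterate the state transition function from initial state x."""
    for u in inputs:
        x = np.tanh(W_in @ np.array([u]) + W @ x)
    return x

inputs = rng.uniform(-1.0, 1.0, 100)
x_a = run(rng.normal(size=N), inputs)   # two different initial states,
x_b = run(rng.normal(size=N), inputs)   # same input sequence
gap = float(np.linalg.norm(x_a - x_b))  # contraction washes out the initial state
```

After 100 contractive steps the distance between the two trajectories is bounded by roughly 0.8^100 times the initial gap, i.e. it is numerically zero: the reservoir has "echo states".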

In particular, ESNs exploit the consequences of the Markovianity of state dynamics in combination with the typically high dimensionality and non-linearity of the recurrent reservoir. The importance of richly varied ESN state dynamics within a large number of reservoir units has been theoretically and experimentally pointed out in the ESN literature (e.g. Jaeger, 2001, Jaeger and Haas, 2004, Tiňo et al., 2007, Verstraeten et al., 2007), although neither completely analyzed nor empirically evaluated. Moreover, a high dimensional reservoir constitutes the basis for arguing a universal approximation property with bounded memory of ESNs, even in the presence of a linear readout layer (Tiňo et al., 2007). Indeed, although the Markovian organization of the reservoir state space rules the dynamics of ESNs, it is known (e.g. Jaeger, 2002c, Makula et al., 2004, Verstraeten et al., 2007) that large reservoirs achieve predictive results on sequence tasks that improve almost proportionally with the number of reservoir units. The Markovian characterization of the reservoir state space therefore seems insufficient to completely explain the performance of the model.

These points open interesting issues, motivating our investigation on the factors which may influence the model behavior and on the assessment of their relative importance. In particular, adopting a critical perspective as in Prokhorov (2005), we are interested in the complementary investigation of characterizing (and not only of identifying) classes of tasks to which ESNs can be successfully/unsuccessfully applied.

In this paper, to approach these investigations still lacking in the ESN literature, we directly consider the Markovianity of reservoir dynamics in relation to the issue of identifying relevant factors that might determine the success and limitations of the ESN model, and we specifically study it in relation to other architectural factors of network design.

Complementarily, on tasks for which ESNs show good results, we pose the question of identifying the sources of richness in reservoir dynamics that can be fruitfully exploited in terms of the predictive accuracy (performance in the following) of the model. The aspects of high dimensionality and non-linearity of reservoirs are studied by asking to what extent the performance improvements obtained by increasing the number of recurrent reservoir units are due to a larger number of non-linear recurrent dynamics or to the possibility of regression in an augmented feature space. We also propose a study of different architectural factors of ESN design which allow the reservoir units to effectively diversify their activations and lead to an enrichment of the state dynamics. This is done by measuring and comparing the effects on performance due to the inclusion of individual factors and combinations of factors in the design of ESNs. This study also investigates the effect on ESN performance of sparsity among reservoir unit connections, which is commonly claimed to be a crucial feature of ESN modeling.

Recently, there has been a growing interest in studying architectural variants and simplifications of the standard ESN model. In particular, a number of reservoir models with an even simpler architecture than the ESN have been proposed. A model with self-recurrent connections only, linear reservoir neurons and unitary input-to-reservoir weights, the so called “Simple ESN” (SESN), was presented in Fette and Eggert (2005). A feed-forward variant of the ESN, the “Feed-Forward ESN” (FFESN), was introduced in Cernanský and Makula (2005), while in Cernanský and Tiňo (2008) a further simplification of the model, with reservoir units organized into a tapped delay line, was proposed. Our work, being directed towards a deeper understanding of the comparative effects of different architectural factors of ESN design on predictive performance, can also be placed in this research direction.
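To give a sense of how simple such variants can be, the following sketches an SESN-style reservoir in the spirit of Fette and Eggert (2005): self-recurrent connections only (a diagonal recurrent matrix), linear units, and unitary input weights. The size and the range of self-loop weights are illustrative assumptions, not values from that paper.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 20
# Self-recurrent connections only: the recurrent matrix is diagonal,
# units are linear, and every input-to-reservoir weight is fixed to 1.
w_self = rng.uniform(0.1, 0.9, N)   # one self-loop weight per unit, |w| < 1

def sesn_states(inputs):
    """Each unit keeps an exponential trace of the input with its own decay."""
    x = np.zeros(N)
    states = []
    for u in inputs:
        x = w_self * x + u          # linear update, unitary input weight
        states.append(x.copy())
    return np.array(states)
```

Under a constant input of 1, unit i converges to 1/(1 - w_i), so even this degenerate reservoir exposes the input history to the readout at N different time scales.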

According to the motivations described above, in short, the aims of this paper can be summarized as follows. We outline complementary cases of the ESN behavior. First, independently of the architectural network design (and reservoir dimensionality), we provide a characterization of contractive ESNs, captured by the concept of Markovian factor. Then, we identify relevant factors of architectural ESN design that allow a larger dimensional reservoir to be effective in terms of network predictive performance. In the approach adopted in the paper, the existence of such cases and the relative relevance of the proposed factors are concretely assessed by specific instances where the effect can be empirically evaluated.

The rest of the paper is organized as follows. Section 2 reviews the ESN model in the framework of RNN processing of sequential data. Section 3 focuses on the Markovian organization of reservoir state dynamics. Section 4 introduces the identified architectural factors of ESN design and the corresponding architectural variants proposed for the standard ESN model. Experimental results are illustrated in Section 5, first by discussing the influence of Markovianity on ESN performance, and then by assessing the relevance of the proposed architectural factors on tasks commonly considered in the ESN literature, showing a significant effect of the reservoir dimensionality. Finally, Section 6 summarizes the main general results of the paper.

Section snippets

Recurrent and echo states models for sequence processing

In this paper we are interested in processing sequence domains. In the following, an input element and an input sequence are represented by u and s(u), respectively. In particular, if s(u) is of length n, then we can write its elements using the notation s(u)=[u(1),u(2),…,u(n)], where u(1) is the oldest entry and u(n) is the most recent one. An empty input sequence is denoted by s(u)=[]. The concatenation of the sequences s(u) and s(v) is denoted by s(u)s(v). An output element and an output

Markovian factor of ESNs

For the aims of this paper, we say that a state model on sequence domains has a state space organization of a Markovian nature whenever the states assumed in correspondence with two different input sequences sharing a common suffix are close to each other, proportionally to the length of the common suffix. This Markovian characterization of the state space dynamics is referred to in this paper as the Markovian factor. A class of models on sequences on which the concept of Markovian factor applies is
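This suffix-based closeness can be observed directly on a small contractive reservoir. The sketch below uses illustrative sizes and scalings (it is not an experiment from the paper): two sequences with unrelated prefixes end in states whose distance shrinks as the shared suffix gets longer.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 30
W = rng.uniform(-1.0, 1.0, (N, N))
W *= 0.7 / np.linalg.svd(W, compute_uv=False)[0]  # contractive setting
W_in = rng.uniform(-0.5, 0.5, (N, 1))

def final_state(seq):
    """State reached after processing the whole sequence from the zero state."""
    x = np.zeros(N)
    for u in seq:
        x = np.tanh(W_in @ np.array([u]) + W @ x)
    return x

suffix = list(rng.uniform(-1.0, 1.0, 20))
prefix_a = list(rng.uniform(-1.0, 1.0, 50))   # two unrelated pasts
prefix_b = list(rng.uniform(-1.0, 1.0, 50))

# State distance after sharing a short versus a long common suffix.
d_short = float(np.linalg.norm(final_state(prefix_a + suffix[:5])
                               - final_state(prefix_b + suffix[:5])))
d_long = float(np.linalg.norm(final_state(prefix_a + suffix)
                              - final_state(prefix_b + suffix)))
```

Each shared step contracts the distance between the two trajectories by at least the contraction coefficient, so `d_long` is several orders of magnitude smaller than `d_short`: the state encodes the recent suffix and forgets the differing prefixes.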

Architectural factors of ESN design

Even though reservoir dynamics are governed by the Markovian factor, there are still several other factors, related to the architectural design, which might influence the richness of the Markovian dynamics and thus the performance of ESNs. Indeed, ESNs with the same contractive coefficient but different topologies can lead to different results on the same task. At the same time, the richness of the dynamics is related to the growth of the number of units (reservoir dimensionality). It is

Experimental results

The experiments presented in the following aim at testing the empirical effects of the factors introduced in Sections 3 and 4. First, in Section 5.2, we use two tasks to show the condition underlying the ESN state space organization, i.e. the Markovian assumption. Under such an extreme condition we show that the Markovian factor dominates the behavior of the model, so that complex architectures are not even necessary. In particular, the

Conclusions

Markovianity and high dimensionality (along with non-linearity) of the reservoir state space representation have been shown to have a relevant influence on the behavior and performance of the ESN model. Such factors have complementary roles and characterize distinct classes of tasks, for which we have provided representative instances. In the following, the findings are detailed, distinguishing the case for which Markovianity has a prominent role independent of the architectural design, and the

References (47)

  • M. Buehner et al. A tighter bound for the echo state property. IEEE Transactions on Neural Networks (2006).
  • Butcher, J., Verstraeten, D., Schrauwen, B., Day, C., & Haycock, P. (2010). Extending reservoir computing with random...
  • Cernanský, M., & Makula, M. (2005). Feed-forward echo state networks. In Proceedings of the IEEE international joint...
  • Cernanský, M., & Tiňo, P. (2007). Comparison of echo state networks with simple recurrent networks and variable-length...
  • M. Cernanský et al. Predictive modeling with echo state networks.
  • T. Cover. Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition. IEEE Transactions on Electronic Computers (1965).
  • L. Feldkamp et al. A signal processing framework based on dynamic neural networks with application to problems in adaptation, filtering, and classification. Proceedings of the IEEE (1998).
  • G. Fette et al. Short term memory and pattern matching with simple echo state networks.
  • Gallicchio, C., & Micheli, A. (2009). On the predictive effects of Markovian and architectural factors of echo state...
  • Hajnal, M., & Lorincz, A. (2006). Critical echo state networks. In Proceedings of the international conference on...
  • B. Hammer et al. Recurrent neural networks with small weights implement definite memory machines. Neural Computation (2003).
  • J. Hertzberg et al. Learning to ground fact symbols in behavior-based robots.
  • Ishu, K., van der Zant, T., Becanovic, V., & Ploger, P. (2004). Identification of motion with echo state network. In...