Abstract
Echo State Networks (ESNs) are a class of recurrent neural networks that can learn to regress on or classify sequential data while keeping the recurrent component random and fixed, training only a set of readout weights; this property makes them attractive to the edge computing and neuromorphic communities. However, ESNs have struggled to perform well on regression and classification tasks and therefore could not compete with traditional RNNs such as LSTM and GRU networks. To address this limitation, we developed a novel hybrid network, the Parallelized Deep Readout Echo State Network (PDR-ESN), which combines a deep-learning readout with a fast random recurrent component built from multiple ESNs computing in parallel. We show that the PDR-ESN architecture allows for different configurations of the sub-reservoirs, leading to different variants, which we explore. Our findings suggest that these variants offer different advantages across task domains, with some performing better on regression and others on classification. In all cases, our PDR-ESN architecture outperforms the corresponding gradient-based LSTM and GRU architectures in both training time and accuracy. For further evaluation, we also compared against a Transformer encoder classifier, which the PDR-ESN outperformed on all tasks. We conclude that the proposed network achieves a good trade-off between the fast training times of traditional ESNs and the accuracy of deep backpropagation on real-world tasks. We hope this architecture offers an alternative approach to sequential processing for edge computing, as well as for the development of more biologically realistic networks.
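To make the parallel-reservoir idea concrete, below is a minimal NumPy sketch of this style of architecture. It is an illustration under stated assumptions rather than the authors' implementation: the sub-reservoir construction (leaky tanh units, spectral-radius rescaling) follows standard ESN practice, all names and hyperparameters (make_reservoir, n_sub, leak, and so on) are hypothetical, and the deep readout is only indicated in a comment, since in the full model it would be a small MLP trained by backpropagation while the reservoirs stay fixed.

    import numpy as np

    rng = np.random.default_rng(0)

    def make_reservoir(n_in, n_res, spectral_radius=0.9):
        # Fixed random input and recurrent weights; the recurrent matrix is
        # rescaled so its largest eigenvalue magnitude equals spectral_radius,
        # a standard heuristic related to the echo state property.
        W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
        W = rng.normal(size=(n_res, n_res))
        W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
        return W_in, W

    def run_reservoir(W_in, W, inputs, leak=0.3):
        # Drive a leaky tanh reservoir with a sequence; return the final state.
        x = np.zeros(W.shape[0])
        for u in inputs:                       # inputs has shape (T, n_in)
            x = (1.0 - leak) * x + leak * np.tanh(W_in @ u + W @ x)
        return x

    # Several independent sub-reservoirs driven in parallel by the same input.
    n_in, n_res, n_sub = 3, 100, 4
    reservoirs = [make_reservoir(n_in, n_res) for _ in range(n_sub)]

    seq = rng.normal(size=(50, n_in))          # a dummy input sequence
    states = [run_reservoir(W_in, W, seq) for W_in, W in reservoirs]
    features = np.concatenate(states)          # (n_sub * n_res,) readout features

    # In the full model, `features` would feed a small MLP readout whose weights
    # are the only trained parameters; the random reservoirs stay fixed.
    print(features.shape)                      # -> (400,)

Because only the readout weights are trained, backpropagation through time over the recurrent weights is avoided entirely, which is the source of the training-time advantage described above.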
Acknowledgements
This work was supported by NSF awards DGE-1632976, BCS-1824198, and OISE-2020624.
Ethics declarations
Conflict of Interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Additional information
This article is part of the topical collection Pattern Recognition Applications and Methods, guest edited by Ana Fred, Maria De Marsico and Gabriella Sanniti di Baja.
Cite this article
Evanusa, M., Shrestha, S., Patil, V. et al. Deep-Readout Random Recurrent Neural Networks for Real-World Temporal Data. SN COMPUT. SCI. 3, 222 (2022). https://doi.org/10.1007/s42979-022-01118-9