Abstract
Recent studies show that the state-space dynamics of randomly initialized recurrent neural networks (RNNs) have interesting and potentially useful properties even without training. More precisely, when an RNN is initialized with small weights, the recurrent unit activities reflect the history of inputs presented to the network according to a Markovian scheme. This property of RNNs is called the Markovian architectural bias. Our work focuses on techniques that exploit this architectural bias. The first technique replaces the RNN output layer with a prediction model, making it possible to exploit the resulting state representation. The second approach, known as echo state networks (ESNs), is based on a large, untrained, randomly interconnected hidden layer that serves as a reservoir of useful dynamics. We investigated both approaches and their combination, and performed simulations to demonstrate their usefulness.
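The ESN idea mentioned above can be illustrated with a minimal sketch: a fixed random reservoir whose recurrent weights are scaled to a spectral radius below one (encouraging the fading, Markovian memory of input history), with only a linear readout trained. All dimensions, the scaling factor, and the toy next-step prediction task are illustrative assumptions, not the paper's actual experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen for illustration only.
n_in, n_res = 1, 100

# Untrained, randomly interconnected reservoir. Rescaling the recurrent
# weight matrix to spectral radius 0.9 (< 1) encourages the echo state
# property: the state becomes a fading memory of the input history.
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))


def run_reservoir(inputs):
    """Collect reservoir states for an input sequence (no training here)."""
    x = np.zeros(n_res)
    states = []
    for u in inputs:
        x = np.tanh(W_in @ np.atleast_1d(u) + W @ x)
        states.append(x.copy())
    return np.array(states)


# Toy next-step prediction task; only the linear readout is trained,
# here by ridge regression on the collected states.
u = np.sin(0.2 * np.arange(300))
washout = 50  # discard the initial transient
X = run_reservoir(u[:-1])[washout:]
y = u[1:][washout:]
W_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(n_res), X.T @ y)

pred = X @ W_out
mse = float(np.mean((pred - y) ** 2))
print(mse)
```

Note that the reservoir weights are never adapted; all task-specific learning happens in the single linear readout, which is what makes ESN training cheap compared with gradient descent through time.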
© 2004 Springer-Verlag Berlin Heidelberg
Cite this paper
Makula, M., Čerňanský, M., Beňušková, Ľ. (2004). Approaches Based on Markovian Architectural Bias in Recurrent Neural Networks. In: Van Emde Boas, P., Pokorný, J., Bieliková, M., Štuller, J. (eds) SOFSEM 2004: Theory and Practice of Computer Science. SOFSEM 2004. Lecture Notes in Computer Science, vol 2932. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24618-3_22
Print ISBN: 978-3-540-20779-5
Online ISBN: 978-3-540-24618-3