Error Entropy Minimization for LSTM Training

Conference paper
Artificial Neural Networks – ICANN 2006 (ICANN 2006)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4131)

Abstract

In this paper we present a new training algorithm for the Long Short-Term Memory (LSTM) recurrent neural network. This algorithm uses entropy instead of the usual mean squared error as the cost function for the weight update. More precisely, we use the Error Entropy Minimization (EEM) approach, where the entropy of the error is minimized after each symbol is presented to the network. Our experiments show that this approach enables the LSTM to converge more frequently than the traditional learning algorithm does. This in turn eases the burden of parameter tuning, since learning is achieved for a wider range of parameter values. The use of EEM also reduces, in some cases, the number of epochs needed for convergence.
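
In Erdogmus and Príncipe's formulation, the EEM cost is Rényi's quadratic entropy of the errors, estimated with a Gaussian Parzen window; minimizing this entropy is equivalent to maximizing the pairwise-kernel "information potential". The sketch below illustrates that estimator and its gradient with respect to the errors, which would then be chained through backpropagation where the MSE gradient would normally enter. It is a minimal illustration under those assumptions, not the implementation used in the paper; the kernel width `sigma`, the sample errors, and the function name are ours.

```python
import numpy as np

def eem_cost_and_grad(errors, sigma=1.0):
    """Renyi's quadratic entropy of the errors and its gradient.

    Parzen estimate of the information potential:
        V(e) = (1/N^2) * sum_i sum_j G(e_i - e_j; sqrt(2)*sigma)
    with G a Gaussian kernel. The EEM cost is H2(e) = -log V(e),
    so minimizing the entropy maximizes V.
    """
    e = np.asarray(errors, dtype=float).reshape(-1, 1)
    n = e.shape[0]
    diffs = e - e.T                        # pairwise differences e_i - e_j
    s2 = 2.0 * sigma ** 2                  # variance of the sqrt(2)*sigma kernel
    kernel = np.exp(-diffs**2 / (2.0 * s2)) / np.sqrt(2.0 * np.pi * s2)
    v = kernel.mean()                      # information potential V(e)
    # dV/de_i: differentiate the Gaussian kernels; the factor 2 collects
    # the symmetric (i, j) and (j, i) terms of the double sum.
    dv = 2.0 * (kernel * (-diffs / s2)).sum(axis=1) / n**2
    return -np.log(v), -dv / v             # H2 and dH2/de_i

# Hypothetical batch of network errors: chain grad through de/dw
# exactly where the MSE gradient would normally enter backpropagation.
cost, grad = eem_cost_and_grad(np.array([0.30, -0.10, 0.25, 0.05]))
```

Note that the kernel width sigma is an extra smoothing parameter of the density estimate, so it joins the learning rate among the quantities to be tuned.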

This work was supported by the Portuguese FCT-Fundação para a Ciência e Tecnologia (project POSC/EIA/56918/2004).

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Alexandre, L.A., de Sá, J.P.M. (2006). Error Entropy Minimization for LSTM Training. In: Kollias, S.D., Stafylopatis, A., Duch, W., Oja, E. (eds) Artificial Neural Networks – ICANN 2006. ICANN 2006. Lecture Notes in Computer Science, vol 4131. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11840817_26

  • DOI: https://doi.org/10.1007/11840817_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-38625-4

  • Online ISBN: 978-3-540-38627-8

  • eBook Packages: Computer Science, Computer Science (R0)
