Abstract
Arithmetic coding is among the best-performing techniques for lossless data compression. It owes its good performance to a probability model that estimates, at each step, the probability of each possible next input symbol given the current context; the better the model, the higher the compression ratio achieved. This work analyses whether discrete-time recurrent neural networks, through their capability for predicting the next symbol in a sequence, can implement such a model. The focus of this study is on online prediction, a task much harder than classical offline grammatical inference with neural networks. The results obtained show that recurrent neural networks handle sequences generated by a finite-state machine without difficulty, easily giving high compression ratios. When compressing real texts, however, the dynamics of the sequences appear to be too complex to be learned online correctly by the net.
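To make the role of the adaptive probability model concrete, the following is a minimal (toy) arithmetic coder driven by an order-0 adaptive frequency model. This is an illustrative sketch, not the paper's implementation: all names are hypothetical, exact rational arithmetic stands in for the scaled integer arithmetic of a practical coder, and the paper's approach would replace the frequency counts with next-symbol probabilities emitted by a recurrent network.

```python
from fractions import Fraction

class AdaptiveModel:
    """Order-0 adaptive model: P(symbol) = count / total, updated after each symbol."""
    def __init__(self, alphabet):
        # Laplace-smoothed counts so every symbol starts with nonzero probability.
        self.counts = {s: 1 for s in alphabet}

    def interval(self, symbol):
        """Cumulative-probability interval [low, high) assigned to `symbol`."""
        total = sum(self.counts.values())
        low = 0
        for s in sorted(self.counts):
            if s == symbol:
                return Fraction(low, total), Fraction(low + self.counts[s], total)
            low += self.counts[s]
        raise KeyError(symbol)

    def update(self, symbol):
        self.counts[symbol] += 1

def encode(text, alphabet):
    """Narrow [low, high) by each symbol's model interval; return a point inside."""
    model = AdaptiveModel(alphabet)
    low, high = Fraction(0), Fraction(1)
    for ch in text:
        s_low, s_high = model.interval(ch)
        span = high - low
        low, high = low + span * s_low, low + span * s_high
        model.update(ch)  # decoder makes the identical update, so models stay in sync
    return (low + high) / 2

def decode(code, n, alphabet):
    """Invert encode(): locate which symbol interval contains the code point."""
    model = AdaptiveModel(alphabet)
    low, high = Fraction(0), Fraction(1)
    out = []
    for _ in range(n):
        target = (code - low) / (high - low)
        for s in sorted(model.counts):
            s_low, s_high = model.interval(s)
            if s_low <= target < s_high:
                span = high - low
                low, high = low + span * s_low, low + span * s_high
                model.update(s)
                out.append(s)
                break
    return "".join(out)
```

The key property the sketch illustrates is the clean separation between coder and model: `encode` and `decode` only ever query `interval` and call `update`, so any predictor that yields a probability distribution over the next symbol — including a recurrent network trained online — can be plugged in without touching the coding loop.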
Pérez-Ortiz, J.A., Calera-Rubio, J. & Forcada, M.L. Online Text Prediction with Recurrent Neural Networks. Neural Processing Letters 14, 127–140 (2001). https://doi.org/10.1023/A:1012491324276