Abstract
In this paper, we propose a new measure within the framework of reinforcement learning by modeling the learning process as an information source. Experiments confirm that Lempel-Ziv coding of a string of episode sequences provides a quality measure describing the degree of complexity of learning. In addition, we discuss functions that compare the expected return with its variance.
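As a rough illustration of the idea, the sketch below counts the phrases produced by an LZ78-style incremental parsing of a string of episodes. The abstract does not specify how episodes are encoded as symbols or which Lempel-Ziv variant is used, so the `(state, action, reward)` encoding, the choice of LZ78, and the helper names `lz78_phrase_count` and `flatten_episodes` are all assumptions made for this example, not the authors' procedure.

```python
from typing import Hashable, Iterable, List, Sequence, Tuple


def lz78_phrase_count(symbols: Sequence[Hashable]) -> int:
    """Count the phrases in the LZ78 incremental parsing of `symbols`.

    Each phrase is the shortest prefix of the remaining input that has
    not been seen as a phrase before. Fewer phrases means the sequence
    is more repetitive (more compressible).
    """
    seen = set()
    phrase: Tuple[Hashable, ...] = ()
    count = 0
    for s in symbols:
        phrase += (s,)
        if phrase not in seen:
            seen.add(phrase)
            count += 1
            phrase = ()
    if phrase:
        # A trailing phrase that duplicates an earlier one still costs
        # one codeword.
        count += 1
    return count


def flatten_episodes(episodes: Iterable[Iterable[Hashable]]) -> List[Hashable]:
    """Concatenate episodes into one symbol string (hypothetical encoding)."""
    return [step for episode in episodes for step in episode]


# Hypothetical data: each step is a (state, action, reward) tuple.
repetitive = [[(0, "R", 0), (1, "R", 1)] for _ in range(4)]
varied = [[(i, "R", 0), (i, "L", 1)] for i in range(4)]

repetitive_count = lz78_phrase_count(flatten_episodes(repetitive))
varied_count = lz78_phrase_count(flatten_episodes(varied))
```

Under these assumptions, an agent that keeps repeating the same episode yields a lower phrase count than one whose episodes keep changing, which is the sense in which the count can serve as a complexity measure for the learning process.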
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Iwata, K., Ishii, N. (2002). Lempel-Ziv Coding in Reinforcement Learning. In: Yin, H., Allinson, N., Freeman, R., Keane, J., Hubbard, S. (eds) Intelligent Data Engineering and Automated Learning — IDEAL 2002. IDEAL 2002. Lecture Notes in Computer Science, vol 2412. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45675-9_80
DOI: https://doi.org/10.1007/3-540-45675-9_80
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44025-3
Online ISBN: 978-3-540-45675-9
eBook Packages: Springer Book Archive