Abstract
A new epoch-incremental reinforcement learning algorithm with fuzzy approximation of the action-value function is developed. The algorithm is tested in practice on the control of a mobile robot that realizes goal-seeking behavior. The obtained results are compared with those of fuzzy versions of reinforcement learning algorithms such as Q(0)-learning, Q(λ)-learning, Dyna learning, and prioritized sweeping. An adaptation of the fuzzy approximator to model-based reinforcement learning algorithms is also proposed.
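To make the setting concrete, the following is a minimal sketch of the generic fuzzy approximation of an action-value function that this family of algorithms builds on: Q(s, a) is a normalized weighted sum of per-rule action values, weighted by the firing strengths of triangular membership functions, and the temporal-difference update distributes credit across rules by those strengths. This is a standard fuzzy Q-learning construction, not the paper's specific epoch-incremental algorithm; all names and parameter choices here are illustrative assumptions.

```python
import numpy as np

def tri(x, centers, width):
    """Triangular membership degrees of scalar state x for each rule center."""
    return np.maximum(0.0, 1.0 - np.abs(x - centers) / width)

class FuzzyQ:
    """Minimal fuzzy approximator of Q(s, a) with a TD(0) update (illustrative)."""

    def __init__(self, centers, width, n_actions, alpha=0.1, gamma=0.95):
        self.centers = np.asarray(centers, dtype=float)
        self.width = width
        self.q = np.zeros((len(self.centers), n_actions))  # per-rule consequents
        self.alpha, self.gamma = alpha, gamma

    def phi(self, s):
        """Normalized rule firing strengths for state s."""
        w = tri(s, self.centers, self.width)
        return w / (w.sum() + 1e-12)

    def value(self, s):
        """Q(s, .) for all actions: weighted sum of rule consequents."""
        return self.phi(s) @ self.q

    def update(self, s, a, r, s_next):
        """One Q-learning step; credit each rule by its firing strength."""
        phi = self.phi(s)
        target = r + self.gamma * self.value(s_next).max()
        delta = target - phi @ self.q[:, a]
        self.q[:, a] += self.alpha * delta * phi
```

A model-based variant (as in Dyna or prioritized sweeping) would replay such updates from a learned transition model between real interaction steps; the epoch-incremental scheme of the paper organizes those updates differently, which this sketch does not capture.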
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Zajdel, R. (2012). Fuzzy Epoch-Incremental Reinforcement Learning Algorithm. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2012. Lecture Notes in Computer Science(), vol 7267. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29347-4_42
Print ISBN: 978-3-642-29346-7
Online ISBN: 978-3-642-29347-4