Abstract
We propose a combination of belief revision and reinforcement learning that yields a self-learning agent. The agent exhibits six qualities we deem necessary for a successful and adaptive learner. This is achieved by representing the agent's belief on two different levels, one numerical and one symbolic. The former is implemented using basic reinforcement learning techniques, while the latter is represented by Spohn's ranking functions. To make these ranking functions fit into a reinforcement learning framework, we studied the revision process and identified key weaknesses of the existing approach. Although the existing revision was designed to support frequent updates, we propose and justify an alternative revision that leads to more plausible results. An example application demonstrates the benefits of the new approach, including faster learning and the extraction of learned rules.
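To make the symbolic level more concrete, the following is a minimal sketch of a ranking function in the general sense of Spohn: each possible world gets a non-negative integer degree of disbelief, at least one world has rank 0, and a proposition is believed when its negation has a rank above 0. The two-feature domain, the rank values, and all function names are hypothetical choices for illustration. The sketch does not implement the revision operator proposed in the article.

```python
# Illustrative toy domain (an assumption, not taken from the article):
# a world is a pair (obstacle, goal_visible) of booleans.
# kappa maps each world to a non-negative integer degree of disbelief;
# the most plausible world has rank 0.
kappa = {
    (False, False): 0,
    (False, True): 1,
    (True, False): 2,
    (True, True): 3,
}

def rank(kappa, prop):
    """Rank of a proposition: the minimum rank of any world satisfying it."""
    ranks = [r for world, r in kappa.items() if prop(world)]
    return min(ranks) if ranks else float("inf")

def believes(kappa, prop):
    """A proposition is believed if its negation is disbelieved (rank > 0)."""
    return rank(kappa, lambda w: not prop(w)) > 0

def conditional_rank(kappa, b, a):
    """Rank of b given a: rank(a and b) - rank(a)."""
    return rank(kappa, lambda w: a(w) and b(w)) - rank(kappa, a)

# Hypothetical propositions over the toy worlds.
obstacle = lambda w: w[0]
goal_visible = lambda w: w[1]

print(rank(kappa, obstacle))                        # disbelief in "obstacle"
print(believes(kappa, lambda w: not obstacle(w)))   # is "no obstacle" believed?
print(conditional_rank(kappa, goal_visible, obstacle))
```

In this sketch a reinforcement learner would still keep its numerical value estimates separately; the ranking function only records which symbolic statements about the world are currently held to be plausible.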
Cite this article
Häming, K., Peters, G. Towards a Self-Learning Agent: Using Ranking Functions as a Belief Representation in Reinforcement Learning. Neural Process Lett 38, 117–129 (2013). https://doi.org/10.1007/s11063-012-9248-7