Towards a Self-Learning Agent: Using Ranking Functions as a Belief Representation in Reinforcement Learning

Neural Processing Letters

Abstract

We propose a combination of belief revision and reinforcement learning that leads to a self-learning agent. The agent exhibits six qualities we deem necessary for a successful and adaptive learner. This is achieved by representing the agent’s beliefs on two levels, one numerical and one symbolic. The former is implemented using basic reinforcement learning techniques, while the latter is represented by Spohn’s ranking functions. To make these ranking functions fit into a reinforcement learning framework, we studied the revision process and identified key weaknesses of the existing approach. Although that revision was designed to support frequent updates, we propose and justify an alternative revision that leads to more plausible results. In an example application, we show the benefits of the new approach, including faster learning and the extraction of learned rules.
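To make the symbolic level more concrete, the following is a minimal sketch of a ranking function in the standard sense of Spohn, together with the textbook conditionalization step. It is not the alternative revision proposed in the article, whose details are not given in this abstract. The two atoms, the example rule, and all function names are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical toy domain: a world is a truth assignment to two atoms.
ATOMS = ("wall_ahead", "move_blocked")
WORLDS = [dict(zip(ATOMS, values))
          for values in product([False, True], repeat=len(ATOMS))]


def rank(kappa, prop):
    """Rank of a proposition: the minimum rank of any world satisfying it."""
    ranks = [kappa[i] for i, w in enumerate(WORLDS) if prop(w)]
    return min(ranks) if ranks else float("inf")


def believes(kappa, prop):
    """A proposition is believed iff its negation has a rank above 0."""
    return rank(kappa, lambda w: not prop(w)) > 0


def conditionalize(kappa, prop, strength):
    """Standard Spohn conditionalization (assumed here, not the article's
    variant): afterwards prop has rank 0 and its negation has rank `strength`."""
    r_a = rank(kappa, prop)
    r_not_a = rank(kappa, lambda w: not prop(w))
    return [
        kappa[i] - r_a if prop(w) else kappa[i] - r_not_a + strength
        for i, w in enumerate(WORLDS)
    ]


def rule(w):
    """Hypothetical learned rule: a wall ahead implies the move is blocked."""
    return (not w["wall_ahead"]) or w["move_blocked"]


kappa0 = [0, 0, 0, 0]                      # total ignorance: no world is disbelieved
kappa1 = conditionalize(kappa0, rule, 2)   # accept the rule with firmness 2

print(believes(kappa0, rule))
print(believes(kappa1, rule))
```

In this sketch a rank of 0 marks a world as not disbelieved, and larger ranks mark increasing disbelief. Revising with the rule only raises the rank of the single world that violates it, which is what allows a believed rule to be read off the ranking afterwards.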

Author information

Corresponding author

Correspondence to Klaus Häming.

About this article

Cite this article

Häming, K., Peters, G. Towards a Self-Learning Agent: Using Ranking Functions as a Belief Representation in Reinforcement Learning. Neural Process Lett 38, 117–129 (2013). https://doi.org/10.1007/s11063-012-9248-7

  • DOI: https://doi.org/10.1007/s11063-012-9248-7