Abstract
Intelligence can be defined, informally, as knowing a lot and being able to use that knowledge flexibly to achieve one's goals. In this sense it is clear that knowledge is central to intelligence. However, it is less clear exactly what knowledge is, what gives it meaning, and how it can be efficiently acquired and used. In this talk we re-examine aspects of these age-old questions in light of modern experience (and particularly in light of recent work in reinforcement learning). Such questions are not just of philosophical or theoretical import; they directly affect the practicality of modern knowledge-based systems, which tend to become unwieldy and brittle (difficult to change) as the knowledge base becomes large and diverse.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Sutton, R.S. (2012). Beyond Reward: The Problem of Knowledge and Data. In: Muggleton, S.H., Tamaddoni-Nezhad, A., Lisi, F.A. (eds) Inductive Logic Programming. ILP 2011. Lecture Notes in Computer Science, vol 7207. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31951-8_2
DOI: https://doi.org/10.1007/978-3-642-31951-8_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31950-1
Online ISBN: 978-3-642-31951-8
eBook Packages: Computer Science (R0)