Abstract
This paper investigates behavioural adaptability in the context of agent-oriented programming. We focus on improving action selection in rule-based agent programming languages by embedding a reinforcement learning mechanism under the hood. The novelty is that learning utilises the agent's existing mental state representation, which means that (i) the programming model is unchanged, so using learning within a program becomes straightforward, and (ii) adaptive behaviours can be combined with regular behaviours in a modular way. Overall, the key to effective programming in this setting is to balance constraining behaviour with operational knowledge against leaving enough flexibility for ongoing adaptation. We illustrate this with different types of programs for solving the Blocks World problem.
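To make the idea concrete, the mechanism the abstract describes can be sketched as Q-learning over an agent's action rules, keyed by an abstraction of its mental state. This is a minimal illustrative sketch, not the paper's GOAL implementation: the class name, the state-key abstraction, and the epsilon-greedy policy are assumptions chosen for brevity.

```python
import random
from collections import defaultdict


class AdaptiveRuleSelector:
    """Illustrative sketch: Q-learning over action rules, where the
    learning state is an abstraction (key) of the agent's beliefs."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1):
        # Q-values indexed by (mental-state key, rule); default 0.0
        self.q = defaultdict(float)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def select(self, state_key, applicable_rules):
        """Epsilon-greedy choice among rules whose conditions hold.
        Operational knowledge constrains the choice set; learning
        only resolves the remaining nondeterminism."""
        if random.random() < self.epsilon:
            return random.choice(applicable_rules)
        return max(applicable_rules, key=lambda r: self.q[(state_key, r)])

    def update(self, state_key, rule, reward, next_state_key, next_rules):
        """Standard one-step Q-learning backup."""
        best_next = max(
            (self.q[(next_state_key, r)] for r in next_rules), default=0.0
        )
        key = (state_key, rule)
        self.q[key] += self.alpha * (
            reward + self.gamma * best_next - self.q[key]
        )


# Toy usage: two applicable rules, one of which always pays off.
random.seed(0)
selector = AdaptiveRuleSelector(epsilon=0.2)
for _ in range(200):
    rule = selector.select("on(a,b)", ["stack", "unstack"])
    reward = 1.0 if rule == "stack" else 0.0
    selector.update("on(a,b)", rule, reward, "on(a,b)", ["stack", "unstack"])
```

After a few hundred trials the learned values come to prefer the rewarding rule, which mirrors the paper's point: the programmer writes the rules, and learning resolves which applicable rule to fire.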
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Singh, D., Hindriks, K.V. (2013). Learning to Improve Agent Behaviours in GOAL. In: Dastani, M., Hübner, J.F., Logan, B. (eds) Programming Multi-Agent Systems. ProMAS 2012. Lecture Notes in Computer Science(), vol 7837. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38700-5_10
DOI: https://doi.org/10.1007/978-3-642-38700-5_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38699-2
Online ISBN: 978-3-642-38700-5