A Reinforcement Learning Approach with Spline-Fit Object Tracking for AIBO Robot’s High Level Decision Making

Mukherjee, Subhasis; Huda, Shamsul; Yearwood, John

doi:10.1007/978-3-642-22288-7_14

Conference paper

Part of the book series: Studies in Computational Intelligence (SCI, volume 368)

Abstract

RoboCup is a popular test bed for AI programs around the world. Robosoccer is one of the two major parts of RoboCup, in which AIBO entertainment robots take part in the middle-sized soccer event. The three key challenges that robots face in this event are manoeuvrability, image recognition and decision making. This paper focuses on one decision-making problem in Robosoccer: the goalkeeper problem. We investigate whether reinforcement learning (RL), as a form of semi-supervised learning, can effectively contribute to the goalkeeper’s decision-making process when the penalty-shot and two-attacker problems are considered. Currently, decision making in Robosoccer is carried out using a rule-based system. RL is also used for quadruped locomotion and navigation in Robosoccer with the AIBO. Moreover, the ball distance is currently calculated using the IR sensors located at the robot’s nose. In this paper, we propose an RL-based approach that uses a dynamic state-action mapping, built with back propagation of reward and Q-learning, together with a spline fit (QLSF) for the final choice of high-level functions in order to save the goal. The novelty of our approach is that the agent learns while playing and can take independent decisions, which overcomes the limitations that rule-based systems suffer from because of their fixed and limited set of predefined decision rules. The spline-fit method, used with the nose camera, was also able to determine the ball’s location and distance more accurately than the IR sensors. The noise-source problem and the near/far sensor dilemma of the IR sensors were neutralized by the proposed spline-fit method. The performance of the proposed method was verified against a benchmark data set built with the Upenn’03 code logic and a baseline experiment with IR sensors. The QLSF approach was found to be more efficient at goalkeeping than the rule-based approach used in conjunction with the IR sensors. QLSF develops a semi-supervised learning process over the rule-based system’s input-output mapping process given in the Upenn’03 code.
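
To make the Q-learning component mentioned in the abstract more concrete, the following is a minimal sketch of a tabular Q-learning update applied to a toy goalkeeper task. It is not the authors' QLSF implementation: the state names (ball_left, ball_centre, ball_right), the action names, the reward values and the parameters alpha, gamma and epsilon are all assumptions chosen for illustration. The task is a single-step episode, so it does not show reward propagating back through several states, and the spline-fit distance estimation is not modelled.

```python
import random
from collections import defaultdict

# Hypothetical high-level goalkeeper actions; the names are illustrative only.
ACTIONS = ["stay", "dive_left", "dive_right"]


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice over the high-level actions."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def train(episodes=2000, seed=0):
    """Learn which action blocks the ball in a toy one-step penalty shot."""
    rng = random.Random(seed)
    q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
    # Assumed ground truth for the toy task: the action that saves the goal.
    saving_action = {
        "ball_left": "dive_left",
        "ball_centre": "stay",
        "ball_right": "dive_right",
    }
    states = list(saving_action)
    for _ in range(episodes):
        state = rng.choice(states)
        action = choose_action(q, state, epsilon=0.2, rng=rng)
        # Assumed reward scheme: +1 for a save, -1 for a conceded goal.
        reward = 1.0 if action == saving_action[state] else -1.0
        # The episode ends after one decision, so the next state is terminal.
        q_update(q, state, action, reward, "terminal")
    return q


q_table = train()
policy = {
    s: max(ACTIONS, key=lambda a: q_table[(s, a)])
    for s in ("ball_left", "ball_centre", "ball_right")
}
print(policy)
```

In this sketch the agent improves its state-action values while it plays, in contrast to a fixed rule table. In a setting closer to the one described in the abstract, the state would presumably be derived from the camera-based ball location and distance rather than from three hand-picked labels.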

Author information

Authors and Affiliations

Centre for Informatics and Applied Optimization, GSITMS, University of Ballarat, Victoria, Australia
Subhasis Mukherjee, Shamsul Huda & John Yearwood

Editor information

Editors and Affiliations

Software Engineering & Information Technology Institute, Central Michigan University, 48859, Mt. Pleasant, MI, U.S.A.
Roger Lee

About this paper

Cite this paper

Mukherjee, S., Huda, S., Yearwood, J. (2011). A Reinforcement Learning Approach with Spline-Fit Object Tracking for AIBO Robot’s High Level Decision Making. In: Lee, R. (eds) Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing 2011. Studies in Computational Intelligence, vol 368. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22288-7_14

DOI: https://doi.org/10.1007/978-3-642-22288-7_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22287-0
Online ISBN: 978-3-642-22288-7
eBook Packages: Engineering, Engineering (R0)