Abstract
In this paper, we investigated an approach by which robots learn to adapt dance actions to human preferences through interaction and feedback. Preferences were extracted by analysing the action patterns common to positive or negative feedback given by the human while the robot danced. Because a buffer stores the dance actions performed before each feedback signal, each individual’s preferences can be extracted even when a reward is received late. The preferred dance actions extracted from different people were then combined to generate improved dance sequences, i.e. performing more of what was preferred and less of what was not preferred. The Sarsa reinforcement learning algorithm was used as the underlying learning algorithm, and together with the Softmax action-selection method it controlled the trade-off between exploitation of the learnt dance skills and exploration of new dance actions. The results showed that the robot learnt the preferences of its human partners through interactive reinforcement learning, and that the dance improved as preferences were extracted from more human partners.
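As a rough illustration of the components named in the abstract, the following minimal Python sketch combines Softmax action selection, a Sarsa update, and an action buffer that credits late feedback to recent actions. It is not the authors' implementation. The state encoding (the previous action), the buffer size, the learning rate, the discount factor, the temperature, and all function names are assumptions chosen only for illustration.

```python
import math
import random
from collections import deque


def softmax_choice(q_values, temperature, rng):
    """Sample an action index with probability proportional to exp(Q / temperature)."""
    peak = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - peak) / temperature) for q in q_values]
    return rng.choices(range(len(q_values)), weights=weights)[0]


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """One Sarsa step: move Q[s][a] toward r + gamma * Q[s_next][a_next]."""
    Q[s][a] += alpha * (r + gamma * Q[s_next][a_next] - Q[s][a])


def credit_buffered_actions(buffer, feedback, preference_counts):
    """Spread one (possibly late) feedback signal over all buffered actions."""
    for action in buffer:
        preference_counts[action] = preference_counts.get(action, 0) + feedback
    buffer.clear()


def dance_episode(Q, feedback_fn, steps, temperature, rng, buffer_size=4):
    """Run one dance; the state is simply the previous action (an assumption)."""
    buffer = deque(maxlen=buffer_size)  # recent actions awaiting feedback
    counts = {}                         # per-action preference tally
    s = 0
    a = softmax_choice(Q[s], temperature, rng)
    for _ in range(steps):
        buffer.append(a)
        r = feedback_fn(list(buffer))   # 0 means the human stayed silent
        if r != 0:
            credit_buffered_actions(buffer, r, counts)
        s_next = a
        a_next = softmax_choice(Q[s_next], temperature, rng)
        sarsa_update(Q, s, a, r, s_next, a_next)
        s, a = s_next, a_next
    return counts
```

In this sketch a high temperature makes the choice close to uniform (more exploration), while a low temperature concentrates the choice on the highest-valued action (more exploitation). A hypothetical human could be simulated by passing, for example, `lambda recent: 1 if recent[-1] == 2 else 0` as `feedback_fn`, with `Q` initialised as a square table of zeros.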
Cite this article
Meng, Q., Tholley, I. & Chung, P.W.H. Robots learn to dance through interaction with humans. Neural Comput & Applic 24, 117–124 (2014). https://doi.org/10.1007/s00521-013-1504-x
DOI: https://doi.org/10.1007/s00521-013-1504-x