
Can a Social Robot Learn to Gesticulate Just by Observing Humans?

  • Conference paper
  • First Online:
Advances in Physical Agents II (WAF 2020)

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 1285))


Abstract

This paper presents a system for generating natural talking-gesture behavior on a humanoid robot. To that end, human talking gestures are recorded with a human pose detector, and the captured motion data is then used to train a Generative Adversarial Network (GAN). The motion capture system properly estimates the limbs and joints involved in expressive human talking behavior without requiring any wearable devices. Deployed on a Pepper robot, the developed system generates natural gestures without becoming repetitive over long talking periods. The approach is compared with a previous work in order to evaluate the improvements introduced by a computationally more demanding approach; the comparison is based on the end effectors' trajectories, measured in terms of jerk and path length. Results show that the described system is able to learn natural gestures purely by observation.
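The abstract states that the comparison is made by computing jerk and path length over the end effectors' trajectories. The paper's exact formulas are not given here, so the following is only a minimal sketch of how such metrics could be computed from a sampled 3D trajectory; the function names and the third-order finite-difference jerk estimate are illustrative assumptions, not the authors' implementation:

```python
import math

def path_length(traj):
    """Total Euclidean path length of a sampled 3D end-effector trajectory."""
    return sum(math.dist(p, q) for p, q in zip(traj, traj[1:]))

def mean_jerk_magnitude(traj, dt):
    """Mean magnitude of the jerk (third time derivative of position),
    estimated with a third-order finite difference:
    x'''(t) ~ (x[i+3] - 3 x[i+2] + 3 x[i+1] - x[i]) / dt^3."""
    jerks = []
    for i in range(len(traj) - 3):
        j = [
            (traj[i + 3][k] - 3 * traj[i + 2][k]
             + 3 * traj[i + 1][k] - traj[i][k]) / dt ** 3
            for k in range(3)
        ]
        jerks.append(math.sqrt(sum(c * c for c in j)))
    return sum(jerks) / len(jerks)
```

Under this reading, a smoother (more natural) gesture yields a lower mean jerk magnitude for a comparable path length, which is a common way to quantify movement smoothness.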


Notes

  1. https://www.vicon.com/.
  2. https://www.ald.softbankrobotics.com/en/robots/pepper.
  3. http://wiki.ros.org/naoqi_driver.
  4. http://doc.aldebaran.com/2-5/naoqi/index.html.
  5. http://www.ros.org.
  6. https://github.com/firephinx/openpose_ros.
  7. https://github.com/stevenjj/openpose_ros/tree/master/skeleton_extract_3d.
  8. https://www.youtube.com/watch?v=h9wpMEH8JQc.
  9. https://www.youtube.com/watch?v=iW1566ozbdg.



Author information


Corresponding author

Correspondence to Igor Rodriguez.


Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zabala, U., Rodriguez, I., Martínez-Otzeta, J.M., Lazkano, E. (2021). Can a Social Robot Learn to Gesticulate Just by Observing Humans?. In: Bergasa, L.M., Ocaña, M., Barea, R., López-Guillén, E., Revenga, P. (eds) Advances in Physical Agents II. WAF 2020. Advances in Intelligent Systems and Computing, vol 1285. Springer, Cham. https://doi.org/10.1007/978-3-030-62579-5_10
