Improving biped walk stability with complementary corrective demonstration

Published in Autonomous Robots

Abstract

We contribute a method for improving the skill execution performance of a robot by complementing an existing algorithmic solution with corrective human demonstration. We apply the proposed method to the biped walking problem, which is a good example of a complex low-level skill because of the complicated dynamics of the walk process in a high-dimensional state and action space. We introduce an incremental learning approach to improve the Nao humanoid robot’s stability during walking. First, we identify, extract, and record a complete walk cycle from the motion of the robot as it executes a given walk algorithm, which we treat as a black box. Second, we apply offline advice operators to improve the stability of the learned open-loop walk cycle. Finally, we present an algorithm that directly modifies the recorded walk cycle using real-time corrective human demonstration. The demonstrator delivers the corrective feedback using a commercially available wireless game controller, without touching the robot. Through the proposed algorithm, the robot learns a closed-loop correction policy for the open-loop walk by mapping the corrective demonstrations to the sensory readings received while walking. Experimental results demonstrate a significant improvement in walk stability.
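The pipeline summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function and class names, the cycle-averaging step, and the Gaussian-kernel regression used to map sensor readings to corrections are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math


def record_walk_cycle(joint_log, period):
    """Average a logged joint trajectory over whole cycles of `period` frames.

    Hypothetical helper: assumes the cycle length is already known.
    """
    n_cycles = len(joint_log) // period
    n_joints = len(joint_log[0])
    cycle = []
    for t in range(period):
        frame = [
            sum(joint_log[c * period + t][j] for c in range(n_cycles)) / n_cycles
            for j in range(n_joints)
        ]
        cycle.append(frame)
    return cycle


class CorrectionPolicy:
    """Maps a sensor reading to a joint correction.

    Assumption: kernel-weighted averaging over stored demonstrations
    stands in for whatever regressor the paper actually uses.
    """

    def __init__(self, n_joints, bandwidth=1.0):
        self.n_joints = n_joints
        self.bandwidth = bandwidth
        self.samples = []  # list of (sensor_reading, correction) pairs

    def add_demonstration(self, sensors, correction):
        # Store the correction the demonstrator gave for this sensor reading.
        self.samples.append((list(sensors), list(correction)))

    def predict(self, sensors):
        if not self.samples:
            return [0.0] * self.n_joints
        weights = []
        for stored, _ in self.samples:
            d2 = sum((a - b) ** 2 for a, b in zip(sensors, stored))
            weights.append(math.exp(-d2 / (2 * self.bandwidth ** 2)))
        total = sum(weights)
        if total == 0.0:
            # Query is too far from every demonstration: apply no correction.
            return [0.0] * self.n_joints
        return [
            sum(w * corr[j] for w, (_, corr) in zip(weights, self.samples)) / total
            for j in range(self.n_joints)
        ]


def closed_loop_command(cycle, t, sensors, policy):
    """Open-loop joint targets for frame t plus the learned correction."""
    base = cycle[t % len(cycle)]
    delta = policy.predict(sensors)
    return [b + d for b, d in zip(base, delta)]
```

In this sketch the recorded cycle is replayed unchanged until demonstrations are added; each demonstration then shifts the joint targets only for sensor readings close to the one at which it was given, which is one simple way to turn an open-loop cycle into a closed-loop one.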

Notes

  1. The Nao V3 model is delivered with a new closed-loop omni-directional walk; however, that model was not available to us during our experimental study.

  2. The mentioned frequency is for the Nao V2 model which was the platform used in this study. The internal control software on the more recent V3 model runs at 100 Hz.

  3. The open-loop walk performance was comparable to that of the original ZMP-based walk, which was unavailable and therefore could not be included in this empirical comparison.

Acknowledgements

The authors would like to thank Tekin Meriçli and Brian Coltin for their valuable feedback on the manuscript. We further thank the Cerberus team for their debugging system, and the CMWrEagle team for their ZMP-based robot walk.

The first author is supported by The Scientific and Technological Research Council of Turkey Programme 2214 and the Turkish State Planning Organization (DPT) under the TAM Project, number 2007K120610.

Author information

Corresponding author

Correspondence to Çetin Meriçli.

About this article

Cite this article

Meriçli, Ç., Veloso, M. & Akın, H.L. Improving biped walk stability with complementary corrective demonstration. Auton Robot 32, 419–432 (2012). https://doi.org/10.1007/s10514-012-9284-1

  • DOI: https://doi.org/10.1007/s10514-012-9284-1
