Optimal Control of Variable Stiffness Policies: Dealing with Switching Dynamics and Model Mismatch

  • Chapter
Geometric and Numerical Foundations of Movements

Part of the book series: Springer Tracts in Advanced Robotics ((STAR,volume 117))

Abstract

Controlling complex robotic platforms is challenging, especially for designs with high levels of kinematic redundancy. Novel variable stiffness actuators (VSAs) have recently been shown to enable more energy-efficient and safer behaviour by allowing output torque and stiffness to be modulated simultaneously, while adding further levels of actuation redundancy. Optimal control has proven effective for such complex actuation mechanisms, yielding control strategies that provide optimal control commands and time-varying stiffness profiles at the same time. However, traditional optimal control formulations have typically focused on optimising tasks over a predetermined time horizon with smooth, continuous plant dynamics. In this chapter, we address the optimal control problem for robotic systems with VSAs in the challenging domain of switching dynamics and discontinuous state transitions arising from interactions with an environment. First, we present a systematic methodology, based on a time-based switching hybrid dynamics formulation, that simultaneously optimises control commands, time-varying stiffness profiles, the switching instants and the total movement duration. We demonstrate the effectiveness of our approach on a brachiating robot with a VSA, using multi-phase swing-up and locomotion tasks as an illustrative application that exploits the benefits of the VSA and the intrinsic dynamics of the system. Then, to address model discrepancies in model-based optimal control, we extend the proposed framework with an adaptive learning algorithm that performs continuous data-driven adjustments to the dynamics model while re-planning optimal policies that reflect this adaptation. We show that this augmented approach can handle a range of model discrepancies in both simulations and hardware experiments.
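
For orientation, a time-based switching hybrid optimal control problem of the kind described in the abstract is commonly written in the generic form below. This is a sketch under assumptions, not a transcription of the chapter's equations: the symbols \(\mathbf{f}_i\), \(\varDelta^{i}\), \(T_i\), \(T_f\), \(K\), \(\phi\), \(\psi^{i}\) and \(h\) are names chosen here for illustration.

```latex
% Phase-wise continuous dynamics (K switches, with T_0 = 0 and T_{K+1} = T_f)
\[
\dot{\mathbf{x}} = \mathbf{f}_i(\mathbf{x}, \mathbf{u}), \qquad T_i \le t < T_{i+1}, \qquad i = 0, \ldots, K
\]
% Instantaneous (possibly discontinuous) state transition at each switching instant
\[
\mathbf{x}(T_i^{+}) = \varDelta^{i}\big(\mathbf{x}(T_i^{-})\big), \qquad i = 1, \ldots, K
\]
% Composite cost: terminal term, via-point terms at the switches, running term
\[
J = \phi\big(\mathbf{x}(T_f)\big) + \sum_{i=1}^{K} \psi^{i}\big(\mathbf{x}(T_i^{-})\big) + \int_{0}^{T_f} h(\mathbf{x}, \mathbf{u})\, dt
\]
```

In this reading, the decision variables are the command trajectory \(\mathbf{u}(t)\) (which, for a VSA, includes the stiffness-adjusting inputs), the switching instants \(T_1, \ldots, T_K\) and the total duration \(T_f\).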

Notes

  1. iLQG is the stochastic extension of iLQR [13]; in the sequel, we may refer to the two interchangeably.

  2. \({\varvec{\alpha }}=\mathrm {diag}(a_1, \ldots , a_m)\) and \({\varvec{\alpha }}^2=\mathrm {diag}(a_1^2, \ldots , a_m^2)\) for notational convenience.

  3. We include position-controlled servo motor dynamics as defined in (6). For the bandwidth parameters of the motors we use \({\varvec{\alpha }}= \mathrm {diag} (20, 25)\). The servo motor commands are limited to \(u_1 \in [-\pi /2, \pi /2]\) and \(u_2 \in [0, \pi /2]\).

  4. In the brachiating robot model in Fig. 2, \(q=q_2\).

  5. Hereafter, we use the term iLQG for the optimisation algorithm of our concern.

  6. Note that the changes introduced by iLQG-LD affect only the dynamics modelling in (1), while the instantaneous state transition map in (2) remains unchanged.

  7. We assume that if the position at the end of each phase is within a threshold \(\varepsilon _{T}=0.040\) m of the desired target, the system is able to start the next phase movement from the ideal location, considering the effect of the gripper on the hardware.

  8. With the reduced input dimensionality, it may in practice be impossible to predict the full state of the system, particularly in the swing-up motion, because of unobserved input dimensions. We therefore considered only the swing locomotion task in the hardware experiment with model learning.
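
The numerical values in Notes 2, 3 and 7 can be made concrete with a short Python sketch. It rests on several assumptions that should be read as such: Eq. (6) is not reproduced here, so the servo model is taken to be a critically damped second-order filter, \(\ddot{\mathbf{q}}_m = {\varvec{\alpha }}^2(\mathbf{u} - \mathbf{q}_m) - 2{\varvec{\alpha }}\dot{\mathbf{q}}_m\), which is consistent with the \({\varvec{\alpha }}\) and \({\varvec{\alpha }}^2\) notation of Note 2 but may differ from the chapter's exact definition. The threshold test assumes Euclidean distance, and all function names are hypothetical.

```python
import numpy as np

# Values quoted in Notes 2, 3 and 7.
ALPHA = np.diag([20.0, 25.0])             # servo bandwidth parameters
ALPHA_SQ = ALPHA @ ALPHA                  # diag(20^2, 25^2), as in Note 2
U_MIN = np.array([-np.pi / 2, 0.0])       # lower limits of u_1, u_2
U_MAX = np.array([np.pi / 2, np.pi / 2])  # upper limits of u_1, u_2
EPS_T = 0.040                             # phase-end threshold in metres


def clip_command(u):
    """Saturate the servo commands to the ranges given in Note 3."""
    return np.clip(u, U_MIN, U_MAX)


def servo_step(q_m, dq_m, u, dt):
    """One semi-implicit Euler step of the ASSUMED servo model
    ddq_m = alpha^2 (u - q_m) - 2 alpha dq_m (not taken from Eq. (6))."""
    u = clip_command(u)
    ddq_m = ALPHA_SQ @ (u - q_m) - 2.0 * ALPHA @ dq_m
    dq_next = dq_m + dt * ddq_m
    q_next = q_m + dt * dq_next
    return q_next, dq_next


def simulate(u, duration=2.0, dt=1e-3):
    """Hold a constant command and return the final motor positions."""
    q_m = np.zeros(2)
    dq_m = np.zeros(2)
    for _ in range(int(round(duration / dt))):
        q_m, dq_m = servo_step(q_m, dq_m, u, dt)
    return q_m


def phase_target_reached(position, target, eps=EPS_T):
    """Note 7 check, ASSUMING Euclidean distance to the target."""
    diff = np.asarray(position, dtype=float) - np.asarray(target, dtype=float)
    return float(np.linalg.norm(diff)) <= eps
```

Under this assumed model, a constant out-of-range command such as `simulate(np.array([3.0, -1.0]))` settles at the saturated values, so the motor positions track the clipped command rather than the requested one.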

References

  1. C.G. Atkeson, A.W. Moore, S. Schaal, Locally weighted learning for control. Artif. Intell. Rev. 11(1–5), 75–113 (1997)

  2. G. Bätz, U. Mettin, A. Schmidts, M. Scheint, D. Wollherr, A.S. Shiriaev, Ball dribbling with an underactuated continuous-time control phase: theory and experiments, in IEEE/RSJ International Conference on Intelligent Robots and Systems (2010), pp. 2890–2895

  3. D. Braun, M. Howard, S. Vijayakumar, Optimal variable stiffness control: formulation and application to explosive movement tasks. Auton. Robot. 33(3), 237–253 (2012)

  4. D.J. Braun, F. Petit, F. Huber, S. Haddadin, P. van der Smagt, A. Albu-Schäffer, S. Vijayakumar, Robots driven by compliant actuators: optimal control under actuation constraints. IEEE Trans. Robot. 29(5), 1085–1101 (2013)

  5. A.E. Bryson, Y.-C. Ho, Applied Optimal Control (Taylor and Francis, United Kingdom, 1975)

  6. M. Buehler, D.E. Koditschek, P.J. Kindlmann, Planning and control of robotic juggling and catching tasks. Int. J. Robot. Res. 13(2), 101–118 (1994)

  7. M. Buss, M. Glocker, M. Hardt, O. von Stryk, R. Bulirsch, G. Schmidt, Nonlinear hybrid dynamical systems: modeling, optimal control, and applications, in Lecture Notes in Control and Information Science (Springer, Heidelberg, 2002), pp. 311–335

  8. T.M. Caldwell, T.D. Murphey, Switching mode generation and optimal estimation with application to skid-steering. Automatica 47(1), 50–64 (2011)

  9. M.G. Catalano, G. Grioli, M. Garabini, F. Bonomo, M. Mancini, N. Tsagarakis, A. Bicchi, VSA-CubeBot: a modular variable stiffness platform for multiple degrees of freedom robots, in IEEE International Conference on Robotics and Automation (2011), pp. 5090–5095

  10. M. Gomes, A. Ruina, A five-link 2D brachiating ape model with life-like zero-energy-cost motions. J. Theor. Biol. 237(3), 265–278 (2005)

  11. K. Goris, J. Saldien, B. Vanderborght, D. Lefeber, Mechanical design of the huggable robot Probo. Int. J. Humanoid Robot. 8(3), 481–511 (2011)

  12. S.S. Groothuis, G. Rusticelli, A. Zucchelli, S. Stramigioli, R. Carloni, The vsaUT-II: a novel rotational variable stiffness actuator, in IEEE International Conference on Robotics and Automation (2012), pp. 3355–3360

  13. W. Li, E. Todorov, Iterative linearization methods for approximately optimal control and estimation of non-linear stochastic system. Int. J. Control 80(9), 1439–1453 (2007)

  14. A.W. Long, T.D. Murphey, K.M. Lynch, Optimal motion planning for a class of hybrid dynamical systems with impacts, in IEEE International Conference on Robotics and Automation (2011), pp. 4220–4226

  15. D. Mitrovic, S. Klanke, M. Howard, S. Vijayakumar, Exploiting sensorimotor stochasticity for learning control of variable impedance actuators, in IEEE-RAS International Conference on Humanoid Robots (2010), pp. 536–541

  16. D. Mitrovic, S. Klanke, S. Vijayakumar, Optimal control with adaptive internal dynamics models, in Fifth International Conference on Informatics in Control, Automation and Robotics (2008)

  17. D. Mitrovic, S. Klanke, S. Vijayakumar, Adaptive optimal feedback control with learned internal dynamics models, in From Motor Learning to Interaction Learning in Robots (2010), pp. 65–84

  18. K. Mombaur, Using optimization to create self-stable human-like running. Robotica 27(3), 321–330 (2009)

  19. J. Nakanishi, J.A. Farrell, S. Schaal, Composite adaptive control with locally weighted statistical learning. Neural Netw. 18(1), 71–90 (2005)

  20. J. Nakanishi, T. Fukuda, D. Koditschek, A brachiating robot controller. IEEE Trans. Robot. Autom. 16(2), 109–123 (2000)

  21. J. Nakanishi, A. Radulescu, D.J. Braun, S. Vijayakumar, Spatio-temporal stiffness optimization with switching dynamics. Auton. Robot. 1–19 (2016)

  22. J. Nakanishi, K. Rawlik, S. Vijayakumar, Stiffness and temporal optimization in periodic movements: an optimal control approach, in IEEE/RSJ International Conference on Intelligent Robots and Systems (2011), pp. 718–724

  23. D. Nguyen-Tuong, J. Peters, Model learning for robot control: a survey. Cogn. Process. 12(4), 319–340 (2011)

  24. F. Petit, M. Chalon, W. Friedl, M. Grebenstein, A. Albu-Schäffer, G. Hirzinger, Bidirectional antagonistic variable stiffness actuation: analysis, design and implementation, in IEEE International Conference on Robotics and Automation (2010), pp. 4189–4196

  25. B. Piccoli, Hybrid systems and optimal control, in IEEE Conference on Decision and Control (1998), pp. 13–18

  26. M. Posa, C. Cantu, R. Tedrake, A direct method for trajectory optimization of rigid bodies through contact. Int. J. Robot. Res. 33(1), 69–81 (2014)

  27. M. Posa, S. Kuindersma, R. Tedrake, Optimization and stabilization of trajectories for constrained dynamical systems, in IEEE International Conference on Robotics and Automation (2016), pp. 1366–1373

  28. A. Radulescu, M. Howard, D.J. Braun, S. Vijayakumar, Exploiting variable physical damping in rapid movement tasks, in IEEE/ASME International Conference on Advanced Intelligent Mechatronics (2012), pp. 141–148

  29. A. Radulescu, J. Nakanishi, S. Vijayakumar, Optimal control of multi-phase movements with learned dynamics, in Man–Machine Interactions 4 (Springer, Heidelberg, 2016), pp. 61–76

  30. K. Rawlik, M. Toussaint, S. Vijayakumar, An approximate inference approach to temporal optimization in optimal control, in Advances in Neural Information Processing Systems, vol. 23 (MIT Press, Cambridge, 2010), pp. 2011–2019

  31. N. Rosa Jr., A. Barber, R.D. Gregg, K.M. Lynch, Stable open-loop brachiation on a vertical wall, in IEEE International Conference on Robotics and Automation (2012), pp. 1193–1199

  32. F. Saito, T. Fukuda, F. Arai, Swing and locomotion control for a two-link brachiation robot. IEEE Control Syst. Mag. 14(1), 5–12 (1994)

  33. S. Schaal, C.G. Atkeson, Constructive incremental learning from only local information. Neural Comput. 10(8), 2047–2084 (1998)

  34. M.S. Shaikh, P.E. Caines, On the hybrid optimal control problem: theory and algorithms. IEEE Trans. Autom. Control 52(9), 1587–1603 (2007)

  35. B. Siciliano, O. Khatib, Springer Handbook of Robotics (Springer, Heidelberg, 2008)

  36. O. Sigaud, C. Salaün, V. Padois, On-line regression algorithms for learning mechanical models of robots: a survey. Robot. Auton. Syst. 59, 1115–1129 (2011)

  37. Y. Tassa, T. Erez, E. Todorov, Synthesis and stabilization of complex behaviors through online trajectory optimization, in IEEE/RSJ International Conference on Intelligent Robots and Systems (2012), pp. 2144–2151

  38. M. Van Damme, B. Vanderborght, B. Verrelst, R. Van Ham, F. Daerden, D. Lefeber, Proxy-based sliding mode control of a planar pneumatic manipulator. Int. J. Robot. Res. 28(2), 266–284 (2009)

  39. R. Van Ham, B. Vanderborght, M. Van Damme, B. Verrelst, D. Lefeber, MACCEPA, the mechanically adjustable compliance and controllable equilibrium position actuator: design and implementation in a biped robot. Robot. Auton. Syst. 55(10), 761–768 (2007)

  40. B. Vanderborght, B. Verrelst, R. Van Ham, M. Van Damme, D. Lefeber, B.M.Y. Duran, P. Beyl, Exploiting natural dynamics to reduce energy consumption by controlling the compliance of soft actuators. Int. J. Robot. Res. 25(4), 343–358 (2006)

  41. S. Vijayakumar, S. Schaal, Locally weighted projection regression: an O(n) algorithm for incremental real time learning in high dimensional space, in International Conference on Machine Learning, Proceedings of the Sixteenth Conference (2000)

  42. L.C. Visser, R. Carloni, S. Stramigioli, Energy-efficient variable stiffness actuators. IEEE Trans. Robot. 27(5), 865–875 (2011)

  43. W. Xi, C.D. Remy, Optimal gaits and motions for legged robots, in IEEE/RSJ International Conference on Intelligent Robots and Systems (2014), pp. 3259–3265

  44. X. Xu, P.J. Antsaklis, Quadratic optimal control problems for hybrid linear autonomous systems with state jumps, in American Control Conference (2003), pp. 3393–3398

  45. X. Xu, P.J. Antsaklis, Results and perspectives on computational methods for optimal control of switched systems, in International Workshop on Hybrid Systems: Computation and Control (Springer, Heidelberg, 2003), pp. 540–555

  46. X. Xu, P.J. Antsaklis, Optimal control of switched systems based on parameterization of the switching instants. IEEE Trans. Autom. Control 49(1), 2–16 (2004)

  47. C. Yang, G. Ganesh, S. Haddadin, S. Parusel, A. Albu-Schäffer, E. Burdet, Human-like adaptation of force and impedance in stable and unstable interactions. IEEE Trans. Robot. 27(5), 918–930 (2011)

Author information

Corresponding author

Correspondence to Sethu Vijayakumar.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Radulescu, A., Nakanishi, J., Braun, D.J., Vijayakumar, S. (2017). Optimal Control of Variable Stiffness Policies: Dealing with Switching Dynamics and Model Mismatch. In: Laumond, JP., Mansard, N., Lasserre, JB. (eds) Geometric and Numerical Foundations of Movements. Springer Tracts in Advanced Robotics, vol 117. Springer, Cham. https://doi.org/10.1007/978-3-319-51547-2_16

  • DOI: https://doi.org/10.1007/978-3-319-51547-2_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-51546-5

  • Online ISBN: 978-3-319-51547-2

  • eBook Packages: Engineering, Engineering (R0)
