Learning potential functions from human demonstrations with encapsulated dynamic and compliant behaviors

Abstract

We consider the problem of devising a unified control policy capable of regulating both the robot motion and its physical interaction with the environment. We formulate this control policy by a non-parametric potential function and a dissipative field, both of which can be learned from human demonstrations. We show that the robot motion and its stiffness behaviors can be encapsulated by the potential function’s gradient and curvature, respectively. The dissipative field can also be used to model the desired damping behavior throughout the motion, hence generating motions that follow the same velocity profile as the demonstrations. The proposed controller can be seen as a unification of “realtime motion generation” and “variable impedance control”, with the advantages of guaranteed stability and no reliance on a reference trajectory. Our approach, called unified motion and variable impedance control (UMIC), is completely time-invariant and can be learned from a few demonstrations by solving two (convex) constrained quadratic optimization problems. We validate UMIC on a library of 30 human handwriting motions and in a set of experiments on a 7-DoF KUKA lightweight robot.

Notes

  1. A time-invariant system is a system whose output does not explicitly depend on time. Note that such a system may still depend on the time derivatives of the state variable (Slotine and Li 1991).

  2. Note that the \(\varvec{\gamma }^i\) point in the direction opposite to the potential field gradient, i.e., \(\nabla \varPhi (\varvec{\xi }^i;\varvec{\varTheta }) = - \varvec{\gamma }^i\). This is why \(\varvec{\gamma }^i\) appears with a positive sign in Eq. (13a).

  3. As we consider diagonal stiffness matrices of the form \(\varvec{S}= s^i \varvec{I}\), the belted ellipsoids are in fact circles.

  4. The regulator case simply results in a motion that goes directly to the target point. It therefore cannot follow complex patterns such as those illustrated in Fig. 10.

  5. Note that, depending on the value of \(\sigma ^i\), the matrix \(\varvec{H}\) may have one or more very small eigenvalues (\({<}10^{-10}\)). The associated eigenvectors correspond to directions along which a shift in \(\varvec{\varTheta }\) leaves the cost unchanged; hence there can be infinitely many equally optimal solutions. Among these, we choose the one with minimum norm.

References

  • Billard, A., Calinon, S., Dillmann, R., & Schaal, S. (2008). Robot programming by demonstration. In Handbook of Robotics (pp. 1371–1394). Berlin/Heidelberg: Springer.

  • Brock, O., Kuffner, J., & Xiao, J. (2008). Motion for manipulation tasks. In Handbook of Robotics (pp. 615–645). Berlin/Heidelberg: Springer. doi:10.1007/978-3-540-30301-5_6.

  • Buchli, J., Stulp, F., Theodorou, E., & Schaal, S. (2011). Learning variable impedance control. The International Journal of Robotics Research, 30(7), 820–833.

  • Calinon, S., D’halluin, F., Sauser, E. L., Caldwell, D. G., & Billard, A. G. (2010a). Learning and reproduction of gestures by imitation: An approach based on Hidden Markov Model and Gaussian Mixture Regression. IEEE Robotics and Automation Magazine, 17(2), 44–54.

  • Calinon, S., Sardellitti, I., & Caldwell, D. G. (2010b). Learning-based control strategy for safe human-robot interaction exploiting task and robot redundancies. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 249–254).

  • Calinon, S., Pistillo, A., & Caldwell, D. G. (2011). Encoding the time and space constraints of a task in explicit-duration hidden Markov model. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 3413–3418).

  • Cohen, M., & Flash, T. (1991). Learning impedance parameters for robot control using an associative search network. IEEE Transactions on Robotics and Automation, 7(3), 382–390. doi:10.1109/70.88148.

  • Ferraguti, F., Secchi, C., & Fantuzzi, C. (2013). A tank-based approach to impedance control with variable stiffness. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) (pp. 4948–4953).

  • Ganesh, G., Jarrasse, N., Haddadin, S., Albu-Schaeffer, A., & Burdet, E. (2012). A versatile biomimetic controller for contact tooling and haptic exploration. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) (pp. 3329–3334).

  • Gomez, J. V., Alvarez, D., Garrido, S., & Moreno, L. (2012). Kinesthetic teaching via fast marching square. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 1305–1310).

  • Gribovskaya, E., Khansari-Zadeh, S. M., & Billard, A. (2010). Learning nonlinear multivariate dynamics of motion in robotic manipulators. The International Journal of Robotics Research, 30, 1–37.

  • Haddadin, S., Albu-Schaffer, A., De Luca, A., & Hirzinger, G. (2008). Collision detection and reaction: A contribution to safe physical human-robot interaction. In IEEE/RSJ International Conference on Intelligent Robots and Systems, 2008. IROS 2008 (pp. 3356–3363).

  • Hogan, N. (1985). Impedance control: An approach to manipulation. ASME Journal of Dynamic Systems, Measurement, and Control, 107.

  • Hogan, N., & Buerger, S. P. (2005). Robotics and Automation Handbook, Impedance and Interaction Control. Boca Raton, FL: CRC.

  • Howard, M., Braun, D. J., & Vijayakumar, S. (2013). Transferring human impedance behavior to heterogeneous variable impedance actuators. IEEE Transactions on Robotics, 29(4), 847–862. doi:10.1109/TRO.2013.2256311.

  • Howard, M., Klanke, S., Gienger, M., Goerick, C., & Vijayakumar, S. (2010). Methods for learning control policies from variable-constraint demonstrations. In From Motor Learning to Interaction Learning in Robots (vol. 264, pp. 253–291). Berlin/Heidelberg: Springer.

  • Ijspeert, A. J., Nakanishi, J., & Schaal, S. (2002). Movement imitation with nonlinear dynamical systems in humanoid robots. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) (pp. 1398–1403).

  • Kavraki, L. E., & LaValle, S. M. (2008). Motion planning. In Handbook of Robotics. Berlin/Heidelberg: Springer. doi:10.1007/978-3-540-30301-5_6.

  • Khansari-Zadeh, S. M. (2011). Lasa human handwriting library. http://lasa.epfl.ch/khansari/LASA_Handwriting_Dataset.zip.

  • Khansari-Zadeh, S. M. (2012). A dynamical system-based approach to modeling stable robot control policies via imitation learning. PhD Thesis, École Polytechnique Fédérale de Lausanne. http://infoscience.epfl.ch/record/182663.

  • Khansari-Zadeh, S. M., & Billard, A. (2011). Learning stable nonlinear dynamical systems with Gaussian mixture models. IEEE Transactions on Robotics, 27(5), 943–957. doi:10.1109/TRO.2011.2159412. ISSN 1552-3098.

  • Khansari-Zadeh, S. M., & Billard, A. (2012). A dynamical system approach to realtime obstacle avoidance. Autonomous Robots, 32, 433–454. ISSN 0929-5593.

  • Khansari-Zadeh, S. M., & Billard, A. (2014). Learning control lyapunov function to ensure stability of dynamical system-based robot reaching motions. Robotics and Autonomous Systems, 62(6), 752–765.

  • Khansari-Zadeh, S. M., Lemme, A., Meirovitch, Y., Schrauwen, B., Giese, M. A., Steil, J., Ijspeert, A. J., & Billard, A. (2013). Benchmarking of state-of-the-art algorithms in generating human-like robot reaching motions. In Workshop at the IEEE-RAS International Conference on Humanoid Robots (Humanoids). http://www.amarsi-project.eu/news/humanoids-2013-workshop.

  • Khansari-Zadeh, S. M., Kronander, K., & Billard, A. (2014). Modeling robot discrete movements with state-varying stiffness and damping: A framework for integrated motion generation and impedance control. In Proceedings of Robotics: Science and Systems X (RSS 2014). Berkeley, California.

  • Khatib, O. (1986). Real-time obstacle avoidance for manipulators and mobile robots. International Journal of Robotics Research, 5, 90–98.

  • Khatib, O. (1987). A unified approach for motion and force control of robot manipulators: The operational space formulation. IEEE Journal of Robotics and Automation, 3, 43–53.

  • Khatib, O. (1995). Inertial properties in robotic manipulation: An object-level framework. The International Journal of Robotics Research, 14(1), 19–36.

  • Khatib, O., Sentis, L., & Park, J.-H. (2008). A unified framework for whole-body humanoid robot control with multiple constraints and contacts. In European Robotics Symposium 2008, volume 44 of Springer Tracts in Advanced Robotics (pp. 303–312). Berlin/Heidelberg: Springer.

  • Kim, B., Park, J., Park, S., & Kang, S. (2010). Impedance learning for robotic contact tasks using natural actor-critic algorithm. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 40(2), 433–443. doi:10.1109/TSMCB.2009.2026289.

  • Kim, J.-O., & Khosla, P. K. (1992). Real-time obstacle avoidance using harmonic potential functions. IEEE Transactions on Robotics and Automation, 8(3), 338–349.

  • Kishi, Y., Yamada, Y., & Yokoyama, K. (2012). The role of joint stiffness enhancing collision reaction performance of collaborative robot manipulators. In 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 376–381).

  • Koditschek, D. (1989). Robot Planning and Control Via Potential Functions (pp. 349–367).

  • Kormushev, P., Calinon, S., & Caldwell, D. G. (2010). Robot motor skill coordination with EM-based reinforcement learning. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (pp. 3232–3237). Taipei, Taiwan.

  • Kronander, K., & Billard, A. (2013). Learning compliant manipulation through kinesthetic and tactile human–robot interaction. IEEE Transactions on Haptics, 7(3), 367–380.

  • Lee, K., & Buss, M. (2008). Force tracking impedance control with variable target stiffness. In Proceedings of the International Federation of Automatic Control World Congress (pp. 6751–6756).

  • Li, M., Yin, H., Tahara, K., & Billard, A. (2014). Learning object-level impedance control for robust grasping and dexterous manipulation. In 2014 IEEE International Conference on Robotics and Automation (ICRA) (pp. 6784–6791).

  • Mattingley, J., & Boyd, S. (2012). Cvxgen: A code generator for embedded convex optimization. Optimization and Engineering, 13(1), 1–27. ISSN 1389-4420.

  • Mitrovic, D., Klanke, S., & Vijayakumar, S. (2011). Learning impedance control of antagonistic systems based on stochastic optimization principles. The International Journal of Robotics Research, 30(5), 556–573.

  • Muelling, K., Kober, J., Kroemer, O., & Peters, J. (2013). Learning to select and generalize striking movements in robot table tennis. International Journal of Robotics Research, 32, 263–279.

  • Ott, C. (2008). Cartesian Impedance Control of Redundant and Flexible-Joint Robots. Springer Tracts in Advanced Robotics.

  • Pipe, A. G. (2000). An architecture for learning “potential field” cognitive maps with an application to mobile robotics. Adaptive Behavior, 8(2), 173–203. doi:10.1177/105971230000800205.

  • Rimon, E., & Koditschek, D. E. (1992). Exact robot navigation using artificial potential functions. IEEE Transactions on Robotics and Automation, 8(5), 501–518. doi:10.1109/70.163777.

  • Schaal, S. (1999). Is imitation learning the route to humanoid robots? Trends in Cognitive Sciences, 3(6), 233–242.

  • Siciliano, B., Sciavicco, L., Villani, L., & Oriolo, G. (2009). Robotics: Modelling, Planning and Control. Advanced Textbooks in Control and Signal Processing. Springer.

  • Slotine, J. J. E., & Li, W. (1991). Applied Nonlinear Control. Englewood Cliffs: Prentice-Hall.

  • Stulp, F., Buchli, J., Ellmer, A., Mistry, M., Theodorou, E., & Schaal, S. (2012). Model-free reinforcement learning of impedance control in stochastic environments. IEEE Transactions on Autonomous Mental Development, 4(4), 330–341.

  • Ude, A., Gams, A., Asfour, T., & Morimoto, J. (2010). Task-specific generalization of discrete and periodic dynamic movement primitives. IEEE Transactions on Robotics, 26(5), 800–815.

  • Villani, L., & De Schutter, J. (2008). Handbook of Robotics, Force Control (pp. 161–185). Berlin/Heidelberg: Springer.

  • Wolf, S., & Hirzinger, G. (2008). A new variable stiffness design: Matching requirements of the next robot generation. In IEEE International Conference on Robotics and Automation, 2008. ICRA 2008 (pp. 1741–1746). doi:10.1109/ROBOT.2008.4543452.

  • Zinn, M., Khatib, O., Roth, B., & Salisbury, J. K. (2004). Playing it safe (human-friendly robots). IEEE Robotics Automation Magazine, 11(2), 12–21. doi:10.1109/MRA.2004.1310938. ISSN 1070-9932.

Acknowledgments

Mohammad Khansari is supported by the Swiss National Science Foundation.

Author information

Corresponding author

Correspondence to Seyed Mohammad Khansari-Zadeh.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (mp4 22815 KB)

Appendix

1.1 Coefficients of the optimization function

As described in Sect. 4, the optimization problem given by Eq. (13) can be transformed into the well-known form:

$$\begin{aligned}&\min _{\varvec{\varTheta }} J(\varvec{\varTheta }) = \frac{1}{2} \varvec{\varTheta }^T \varvec{H} \varvec{\varTheta } + \varvec{h}^T \varvec{\varTheta } + h_0\nonumber \\&{\text {subject to}} \end{aligned}$$
(20a)
$$\begin{aligned}&\varvec{\mathcal {C}} \varvec{\varTheta } \le \varvec{0} \end{aligned}$$
(20b)
$$\begin{aligned}&\varvec{\mathcal {G}} \varvec{\varTheta } = \varvec{g} \end{aligned}$$
(20c)

where \(\varvec{H} \in {\mathbb {R}}^{{\mathcal {T}} \times {\mathcal {T}}}\) is a symmetric positive semi-definite matrix (see Note 5), \(\varvec{{\mathcal {C}}} \in {\mathbb {R}}^{{\mathcal {T}} \times {\mathcal {T}}}\) and \(\varvec{{\mathcal {G}}} \in {\mathbb {R}}^{d \times {\mathcal {T}}}\) are full-rank matrices, \(\varvec{h} \in {\mathbb {R}}^{{\mathcal {T}}}\), \(\varvec{g} \in {\mathbb {R}}^{d}\), and \(h_0\) is a scalar independent of \(\varvec{\varTheta }\).
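
For readers who want to reproduce this step numerically, the sketch below shows how a QP of the form of Eq. (20) could be handed to an off-the-shelf convex solver. It is only an illustration: the paper itself relies on CVXGEN (Mattingley and Boyd 2012), whereas here we use CVXPY, and the helper name `solve_umic_qp` as well as the argument names are ours. The arrays `H`, `h`, `C_mat`, `G_mat` and `g` are assumed to have been assembled as described in the rest of this appendix.

```python
# Illustrative sketch only: solve a QP of the form of Eq. (20) with CVXPY.
# The paper uses CVXGEN; any QP solver accepting the same data will do.
import numpy as np
import cvxpy as cp

def solve_umic_qp(H, h, C_mat, G_mat, g):
    T = H.shape[0]
    # Symmetrize and add a tiny ridge so the PSD check passes despite round-off;
    # H = 2 A^T A is positive semi-definite by construction (Eq. 24a).
    H = 0.5 * (H + H.T) + 1e-9 * np.eye(T)
    theta = cp.Variable(T)
    # 1/2 theta^T H theta + h^T theta (the constant h0 does not affect the minimizer)
    objective = cp.Minimize(0.5 * cp.quad_form(theta, H) + h @ theta)
    constraints = [C_mat @ theta <= 0,   # Eq. (20b)
                   G_mat @ theta == g]   # Eq. (20c)
    cp.Problem(objective, constraints).solve()
    return theta.value
```

The tiny ridge also biases the solver toward a small-norm solution when \(\varvec{H}\) has near-zero eigenvalues (cf. Note 5).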

In this appendix, we show how to compute the coefficients of the quadratic program given by Eq. (20). The first step is to write the term \(- \nabla \varPhi (\varvec{\xi }^i;\varvec{\varTheta })\) in the form \(\varvec{A}^i \varvec{\varTheta } + \varvec{b}^i\), where \(\varvec{A}^i \in {\mathbb {R}}^{d \times {\mathcal {T}}}\) and \(\varvec{b}^i \in {\mathbb {R}}^{d}\). Let us define:

$$\begin{aligned}&\varvec{\eta }^{ik} = \frac{\tilde{\omega }^{k}(\varvec{\xi }^{i})}{(\sigma ^{k})^2}\left( \varvec{\xi }^{i} - \varvec{\xi }^{k}\right) \end{aligned}$$
(21a)
$$\begin{aligned}&\varvec{\rho }^{i} = \sum _{k=1}^{{\mathcal {T}}} \varvec{\eta }^{ik}\end{aligned}$$
(21b)
$$\begin{aligned}&\upsilon ^{ik} = \frac{1}{2}\left( \varvec{\xi }^i - \varvec{\xi }^k\right) ^T \varvec{S}^k \left( \varvec{\xi }^i - \varvec{\xi }^k\right) \end{aligned}$$
(21c)

then we have:

$$\begin{aligned}&\varvec{A}^i = \left[ \varvec{\eta }^{i1} - \tilde{\omega }^{1}(\varvec{\xi }^{i}) \varvec{\rho }^{i} \ldots \varvec{\eta }^{i {\mathcal {T}}} - \tilde{\omega }^{{\mathcal {T}}}(\varvec{\xi }^{i}) \varvec{\rho }^{i} \right] \end{aligned}$$
(22a)
$$\begin{aligned}&\varvec{b}^i = \sum _{k=1}^{{\mathcal {T}}} \upsilon ^{ik} \varvec{\eta }^{ik} - \tilde{\omega }^{k}(\varvec{\xi }^{i}) \Big ( \upsilon ^{ik} \varvec{\rho }^{i} + \varvec{S}^k\left( \varvec{\xi }^{i} - \varvec{\xi }^{k}\right) \Big ) \end{aligned}$$
(22b)
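
A direct transcription of Eqs. (21)–(22) into numpy might look as follows. This is a sketch under our own naming conventions: `xi` stacks the \({\mathcal {T}}\) training points row-wise, `w[i, k]` holds the normalized weight \(\tilde{\omega }^{k}(\varvec{\xi }^{i})\) (whose definition appears in the main text and is not repeated here), `sigma` the kernel widths \(\sigma ^k\), and `S` the stiffness matrices \(\varvec{S}^k\).

```python
import numpy as np

def per_point_coefficients(xi, w, sigma, S):
    """Illustrative computation of A^i and b^i (Eqs. 21-22) for every data point.

    xi    : (T, d) array of training points xi^k
    w     : (T, T) array, w[i, k] = normalized weight omega~^k(xi^i)
    sigma : (T,)   array of kernel widths sigma^k
    S     : (T, d, d) array of stiffness matrices S^k
    Returns A : (T, d, T) and b : (T, d).
    """
    T, d = xi.shape
    A = np.zeros((T, d, T))
    b = np.zeros((T, d))
    for i in range(T):
        diff = xi[i] - xi                                    # rows are xi^i - xi^k
        eta = (w[i] / sigma**2)[:, None] * diff              # Eq. (21a), rows eta^{ik}
        rho = eta.sum(axis=0)                                # Eq. (21b)
        Sdiff = np.einsum('kab,kb->ka', S, diff)             # rows S^k (xi^i - xi^k)
        upsilon = 0.5 * np.einsum('ka,ka->k', diff, Sdiff)   # Eq. (21c)
        A[i] = (eta - w[i][:, None] * rho).T                 # Eq. (22a), column k
        b[i] = (upsilon[:, None] * eta
                - w[i][:, None] * (upsilon[:, None] * rho + Sdiff)).sum(axis=0)  # Eq. (22b)
    return A, b
```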

The desired gradient for the i-th data point is also given by \(\varvec{\gamma }^i\) (see Sect. 4). To account for all the points in the data set, we concatenate the matrices \(\varvec{A}^i\), vectors \(\varvec{b}^i\) and \(\varvec{\gamma }^i\) into a bigger matrix \(\varvec{A}\in {\mathbb {R}}^{{\mathcal {T}}d \times {\mathcal {T}}}\) and vectors \(\varvec{b} \in {\mathbb {R}}^{{\mathcal {T}}d}\) and \(\varvec{\gamma } \in {\mathbb {R}}^{{\mathcal {T}}d}\):

$$\begin{aligned}&\varvec{A}= \left[ (\varvec{A}^1)^T \ldots (\varvec{A}^{{\mathcal {T}}})^T \right] ^{T}\end{aligned}$$
(23a)
$$\begin{aligned}&\varvec{b} = \left[ \varvec{b}^1 \ldots \varvec{b}^{{\mathcal {T}}} \right] ^{T}\end{aligned}$$
(23b)
$$\begin{aligned}&\varvec{\gamma } = \left[ \varvec{\gamma }^1 \cdots \varvec{\gamma }^{{\mathcal {T}}} \right] ^{T} \end{aligned}$$
(23c)

The objective of Eq. (13a) can then be written in the quadratic form \(\frac{1}{2} \varvec{\varTheta }^T \varvec{H} \varvec{\varTheta } + \varvec{h}^T \varvec{\varTheta } + h_0\), with:

$$\begin{aligned}&\varvec{H} = 2 \varvec{A}^T \varvec{A}\end{aligned}$$
(24a)
$$\begin{aligned}&\varvec{h} = 2 \varvec{A}^T (\varvec{b} - \varvec{\gamma }) \end{aligned}$$
(24b)
$$\begin{aligned}&h_0 = (\varvec{b} - \varvec{\gamma })^T (\varvec{b} - \varvec{\gamma }) \end{aligned}$$
(24c)
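
Given the per-point quantities \(\varvec{A}^i\), \(\varvec{b}^i\) and the desired gradients \(\varvec{\gamma }^i\), the stacking of Eq. (23) and the coefficients of Eq. (24) reduce to a few lines. Again, this is only an illustrative sketch that reuses the (hypothetical) arrays returned by `per_point_coefficients` above.

```python
import numpy as np

def quadratic_coefficients(A, b, gamma):
    """Assemble H, h, h0 of Eq. (24).

    A     : (T, d, T) per-point matrices A^i
    b     : (T, d)    per-point vectors  b^i
    gamma : (T, d)    desired negative gradients gamma^i
    """
    T, d, _ = A.shape
    A_big = A.reshape(T * d, T)        # Eq. (23a): vertical stacking of the A^i
    b_big = b.reshape(T * d)           # Eq. (23b)
    gamma_big = gamma.reshape(T * d)   # Eq. (23c)
    r = b_big - gamma_big
    H = 2.0 * A_big.T @ A_big          # Eq. (24a)
    h = 2.0 * A_big.T @ r              # Eq. (24b)
    h0 = r @ r                         # Eq. (24c)
    return H, h, h0
```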

The matrix \(\varvec{{\mathcal {C}}}\) in Eq. (20b) can be computed by:

$$\begin{aligned} \varvec{{\mathcal {C}}}^{ij} = {\left\{ \begin{array}{ll} -1 &{}\quad i=j \\ 1 &{}\quad i=j-1,~ i \notin \varOmega \\ 0 &{}\quad otherwise \end{array}\right. } \end{aligned}$$
(25)

where \(\varvec{{\mathcal {C}}}^{ij}\) denotes the component in the i-th row and j-th column of \(\varvec{{\mathcal {C}}}\), and \(\varOmega \) is the set of indices corresponding to the last point of each demonstration trajectory.

The matrix \(\varvec{{\mathcal {G}}}\) and vector \(\varvec{g}\) in Eq. (20c) are equal to \(\varvec{A}^{{\mathcal {T}}}\) and \(-\varvec{b}^{{\mathcal {T}}}\), respectively, because the last point of each demonstration trajectory is the target point, i.e., \(\varvec{\xi }^{{\mathcal {T}}} = \varvec{\xi }^*\). Since there are N trajectories, any of the N available \(\varvec{A}^i\) and \(\varvec{b}^i\), \(i \in \varOmega \), could be used to obtain \(\varvec{{\mathcal {G}}}\) and \(\varvec{g}\). For simplicity, we use the last point of the last trajectory, which by construction has the index \({\mathcal {T}}\).
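
The constraint data of Eq. (20) can be assembled in the same spirit. The sketch below builds \(\varvec{{\mathcal {C}}}\) from Eq. (25) and takes \(\varvec{{\mathcal {G}}} = \varvec{A}^{{\mathcal {T}}}\), \(\varvec{g} = -\varvec{b}^{{\mathcal {T}}}\) as explained above; note that the code uses zero-based indices, whereas the text is one-based.

```python
import numpy as np

def constraint_matrices(A, b, Omega):
    """Build C (Eq. 25), G and g (Eq. 20c).

    A, b  : per-point arrays from per_point_coefficients
    Omega : set of (zero-based) indices of the last point of each demonstration;
            the overall last index T-1 corresponds to the target point.
    """
    T = A.shape[0]
    C_mat = -np.eye(T)                 # -1 on the diagonal
    for i in range(T - 1):
        if i not in Omega:
            C_mat[i, i + 1] = 1.0      # 1 at (i, i+1) unless i ends a demonstration
    G_mat = A[T - 1]                   # G = A^{T} (last point = target)
    g = -b[T - 1]                      # g = -b^{T}
    return C_mat, G_mat, g
```

With these pieces in place, a call such as `solve_umic_qp(H, h, C_mat, G_mat, g)` from the first listing would return the learned parameters \(\varvec{\varTheta }\).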

1.2 Stability proof

For a manipulator with d generalized degrees of freedom \(\varvec{\xi }\in {\mathbb {R}}^{d}\), the robot dynamics (whether in operational or joint space) can be represented by (Khatib 1995; Ott 2008; Siciliano et al. 2009):

$$\begin{aligned} \varvec{M}(\varvec{\xi }) \ddot{\varvec{\xi }}+ \varvec{C}\left( \varvec{\xi },\dot{\varvec{\xi }}\right) \dot{\varvec{\xi }}+ \varvec{g}(\varvec{\xi }) = \varvec{\tau } + \varvec{\tau }_{ext} \end{aligned}$$
(26)

where \(\varvec{M}(\varvec{\xi }) \in {\mathbb {R}}^{d \times d}\) is the inertia matrix, \(\varvec{C}(\varvec{\xi },\dot{\varvec{\xi }}) \in {\mathbb {R}}^{d \times d}\) is the Coriolis/centrifugal matrix, \(\varvec{g}(\varvec{\xi })\) is the gravitational force, \(\varvec{\tau }\) is the generalized force produced by the actuators, and \(\varvec{\tau }_{ext}\) is the external generalized force applied to the robot by the environment.

As in an impedance controller, the actuation in our control setting is composed of two terms: a gravitational term \(\varvec{g}(\varvec{\xi })\) that compensates for the weight of the robot, and a control term \(\varvec{\tau }_{c}\) that performs the task:

$$\begin{aligned} \varvec{\tau } = \varvec{\tau }_{c} + \varvec{g}(\varvec{\xi }) \end{aligned}$$
(27)

To verify the stability of UMIC, we use the following definition of passivity, taken from Slotine and Li (1991):

Definition 1

A system with input effort v and output flow y is passive if it satisfies:

$$\begin{aligned} \dot{V} = v^Ty-n \end{aligned}$$
(28)

for some lower-bounded scalar function V and some scalar function \(n \ge 0\).
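
As a simple illustration of this definition (our own example, not taken from the paper), consider a mass \(m\) with viscous damping \(c > 0\) pushed by an external force \(F\), i.e., \(m\ddot{x} = F - c\dot{x}\). Taking the storage function \(V = \frac{1}{2} m \dot{x}^2\) gives

$$\begin{aligned} \dot{V} = \dot{x}\, m\ddot{x} = \dot{x} F - c \dot{x}^2, \end{aligned}$$

which matches Eq. (28) with effort \(F\), flow \(\dot{x}\), and \(n = c\dot{x}^2 \ge 0\); the damped mass is therefore passive.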

In our setting, the effort is \(\varvec{\tau }_{ext}\) and the flow is \(\dot{\varvec{\xi }}\). To ensure passivity/stability of our controller, we define the following candidate Lyapunov function:

$$\begin{aligned} V(\varvec{\xi },\dot{\varvec{\xi }}) = \varPhi (\varvec{\xi }) + \frac{1}{2} \dot{\varvec{\xi }}^T \varvec{M}(\varvec{\xi }) \dot{\varvec{\xi }}\end{aligned}$$
(29)

Taking the time-derivative of \(V(\varvec{\xi },\dot{\varvec{\xi }})\) yields:

$$\begin{aligned} \dot{V}(\varvec{\xi },\dot{\varvec{\xi }}) = \dot{\varvec{\xi }}^T \nabla \varPhi (\varvec{\xi }) + \dot{\varvec{\xi }}^T \varvec{M}(\varvec{\xi }) \ddot{\varvec{\xi }}+ \frac{1}{2}\dot{\varvec{\xi }}^T \dot{\varvec{M}}\left( \varvec{\xi },\dot{\varvec{\xi }}\right) \dot{\varvec{\xi }}\end{aligned}$$
(30)

The term \(\varvec{M}(\varvec{\xi }) \ddot{\varvec{\xi }}\) can be obtained by rearranging Eq. (26):

$$\begin{aligned} \varvec{M}(\varvec{\xi }) \ddot{\varvec{\xi }}= \varvec{\tau } + \varvec{\tau }_{ext} - \varvec{C}\left( \varvec{\xi },\dot{\varvec{\xi }}\right) \dot{\varvec{\xi }}- \varvec{g}(\varvec{\xi }) \end{aligned}$$
(31)

Furthermore, from Eqs. (1) and (27) we have \(\varvec{\tau } = \varvec{\tau }_{c} + \varvec{g}(\varvec{\xi }) = -\nabla \varPhi (\varvec{\xi }) - \varvec{\varPsi }(\varvec{\xi },\dot{\varvec{\xi }}) + \varvec{g}(\varvec{\xi })\). Substituting this and Eq. (31) into Eq. (30), and using the skew-symmetry of \(\dot{\varvec{M}}(\varvec{\xi },\dot{\varvec{\xi }}) - 2\varvec{C}(\varvec{\xi },\dot{\varvec{\xi }})\), Eq. (30) simplifies to:

$$\begin{aligned} \dot{V}(\varvec{\xi },\dot{\varvec{\xi }})&= - \dot{\varvec{\xi }}^T \varvec{\varPsi }(\varvec{\xi },\dot{\varvec{\xi }}) + \dot{\varvec{\xi }}^T \varvec{\tau }_{ext} \nonumber \\&= -\, \underbrace{\dot{\varvec{\xi }}^T \Big ( \sum _{i} \tilde{\omega }^i(\varvec{\xi }) \varvec{D}^i \Big ) \dot{\varvec{\xi }}}_{\varvec{n}(\varvec{\xi },\dot{\varvec{\xi }}) \ge 0} + \dot{\varvec{\xi }}^T \varvec{\tau }_{ext} \nonumber \\&= - \varvec{n}(\varvec{\xi },\dot{\varvec{\xi }}) + \dot{\varvec{\xi }}^T \varvec{\tau }_{ext} \end{aligned}$$
(32)

Note that \(\varvec{n}(\varvec{\xi },\dot{\varvec{\xi }}) \ge 0\) because the \(\varvec{D}^i\) are positive-definite matrices and, by construction, \(\tilde{\omega }^i(\varvec{\xi })>0\). Hence, following Definition 1, UMIC yields a passive map from the external generalized force to the velocity of the manipulator.
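
The passivity argument can also be checked numerically. The following sketch simulates a point mass (so that \(\varvec{M} = m\varvec{I}\), \(\varvec{C} = \varvec{0}\), \(\varvec{g} = \varvec{0}\) in Eq. (26)) under a controller of the form \(\varvec{\tau }_c = -\nabla \varPhi - \varvec{D}\dot{\varvec{\xi }}\), with a hypothetical quadratic potential and constant damping standing in for the learned UMIC fields, and verifies that the stored energy of Eq. (29) never grows by more than the energy supplied through \(\varvec{\tau }_{ext}\).

```python
import numpy as np

# Minimal numerical sanity check of the passivity argument for a point mass
# (M = m*I, C = 0, g = 0) under tau_c = -grad(Phi) - d_damp*xdot. A quadratic
# potential and constant damping stand in for the learned UMIC fields.
m, k, d_damp = 1.0, 4.0, 0.8
Phi      = lambda x: 0.5 * k * float(x @ x)
grad_Phi = lambda x: k * x
dt, steps = 1e-3, 5000

x, xdot = np.array([1.0, -0.5]), np.zeros(2)
V0 = Phi(x) + 0.5 * m * float(xdot @ xdot)   # storage function of Eq. (29)
supplied = 0.0                               # integral of xdot^T tau_ext
for t in range(steps):
    tau_ext = np.array([0.3, 0.0]) if t < 1000 else np.zeros(2)  # external push
    tau_c = -grad_Phi(x) - d_damp * xdot
    xddot = (tau_c + tau_ext) / m
    supplied += float(xdot @ tau_ext) * dt
    x, xdot = x + dt * xdot, xdot + dt * xddot

V_end = Phi(x) + 0.5 * m * float(xdot @ xdot)
# Passivity (Definition 1 with n >= 0): the stored energy can grow at most by
# the energy supplied through the external force.
assert V_end - V0 <= supplied + 1e-3
print(V_end - V0, supplied)
```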

About this article

Cite this article

Khansari-Zadeh, S.M., Khatib, O. Learning potential functions from human demonstrations with encapsulated dynamic and compliant behaviors. Auton Robot 41, 45–69 (2017). https://doi.org/10.1007/s10514-015-9528-y
