Robot trajectory generation using modified hidden Markov model and Lloyd's algorithm in joint space

https://doi.org/10.1016/j.engappai.2016.03.006Get rights and content

Abstract

Human guide robots usually generate desired trajectories from human demonstrations. The training process can be in task space or joint space. The task space method needs the inverse kinematics; the joint space method uses dynamic time warping. Both of them destroy the accuracy of the generated trajectory.

In this paper, we first use Lloyd's algorithm to encode the input signals such that the observations are time-independent. The desired trajectory is generated in joint space without dynamic time warping. Then we modify the hidden Markov model (HMM) such that it can work in joint space. Since the desired trajectories are the joint angles, they can be applied directly to robot control without calculating the inverse kinematics. Simulation and experimental results show that the modified HMM with Lloyd's algorithm work well in joint space.

Introduction

The control structure of many human guide robots, such as robot exoskeleton, humanoid robot, and surgical robot, has three levels: joint angle tracking, trajectory planning, and path planning, see Fig. 1. The path planning is in task space. It generates a path from a starting-point to an ending-point with respect to some restrictions. The trajectory planning can be in task space or joint space. It gives the desired trajectories of the end-effector (task space), or desired joint angles (joint space), which are in the path generated by the path planning. The joint angle tracking usually uses PD/PID controller to force the joints to follow the desired angles generated by the trajectory planning.

The final problem of this control structure is to generate desired joint angles which satisfy human requirements (Xu and Lee, 2005). This is the object of transferring human skill to the robot through demonstrations. We also call it programming by demonstration (PbD) or learning from demonstration (LfD) (Argalla et al., 2009). The robot trajectory generation can be broadly divided into two trends: (1) Symbolic level. The human skill is decomposed into a sequence of action–perception units, then a statistical model is used to deal with the demonstrations (Hersch et al., 2008, Billard et al., 2007). (2) Trajectories level. A nonlinear mapping is used to model the sensor/motor information. The trajectories level method is robust to the environment changes (Ijspeert et al., 2012). The symbolic level method is suitable to model complex human actions. In this paper we use the symbolic level method.

Although the trajectory planning in joint space can avoid the calculation of the inverse kinematics, the demonstrations in joint space are time-dependent. Fig. 2 shows the trajectories in joint space and task space, when a two-link planar robot draws a broken line. Since the trajectories in task space only give space relation, the three lines overlap in task space. However, in joint space they are complete different because of different drawing speeds. In this sense, training in task space is easier than in joint space (Vakanski et al., 2012, Kuniyoshia et al., 2004, Wang et al., 2013, Gribovskaya et al., 2011). After the training in task space, the inverse kinematics need to be solved, which requires complete knowledge of the robot.

There are few works in joint space (Berg et al., 2010, Peters and Schaal, 2008). The dynamic time warping (DTW) is an effective tool to deal with the time-dependent problem. The computation time for one-dimensional signals, such as time series, is in polynomial. The extension of DTW for more than two-dimension, like robots, becomes NP-complete. The accuracy of the high dimension approximation is also very low (Sakoe and Chiba, 1978).

Statistical learning techniques deal with the high variability inherent in the demonstrations. They are not sensitive to disturbances. For instance, spline smoothing technique can deal with the uncertainty in several motion demonstrations (Ude, 1993). The mean and variance of the collected variables are applied in Ogawara et al. (2003) to generate a model. Ito et al. (2006) realizes online imitation by encoding two different motor loops.

Hidden Markov Model (HMM) generates a sequence which is called Markov chain (Rabiner, 1989, Yang et al., 1997). It can encode the motion of a robot, and find the highest probability state path by Viterbi algorithm (Viterbi, 1967). HMMs use finite Gaussian mixture models as their hidden state distributions. The Gaussian mixture model can encode a set of trajectories (Argalla et al., 2009). The Gaussian mixture regression can retrieve a smooth trajectory from several demonstrations (Billard et al., 2007). There are many successful application of robot trajectory generation via HMM (Yang et al., 1997, Kwon and Park, 2008). HMM offers many advantages over the other statistical models for human behavior modeling, such as better compression, variant structures, and training incrementally. One weakness of HMM is that the trajectory generation can only use current state for the emission and the transition probabilities. HMM does not map well to many time-dependent domains, such as joint space (Mohamed and Gader, 2000).

To train HMM, it is necessary to map continuous trajectories into discrete values, named codebook. It is impossible to use all sampled data to train HMM. The key-points include necessary information for HMM. The normal method of selecting the key-points uses the shape of the trajectory. It can be positions evaluation (Yang et al., 1997) or position/velocity evaluation (Vakanski et al., 2012). Linde–Buzo–Gray (LBG) is the most popular method. The above methods do not work well in joint space, because the trajectories in joint space are time-dependent, while these methods use the shape information (Gernot, 2008). Lloyd's algorithm partitions data into well-shaped and uniformly sized convex cells (Lloyd, 1982). It repeatedly finds the centroid of each set in the partition using Voronoi diagrams. In this paper, we use Lloyd's algorithm to avoid the following two problems: (1) calculation problem of the dynamic time warping in joint space, (2) selection problems of the codebook and key-points in joint space. The classical HMM is also modified such that it is more feasible to generate joint trajectory. A new element is introduced for HMM.

In order to generate the desired robot trajectory, this paper proposes a different method from the previous works like inverse kinematics and dynamic time warping. The contributions of the paper include three parts:

  • 1.

    The classical HMM is modified such that it is more feasible to generate trajectory in joint space. We introduce a new auxiliary output in HMM to help training process.

  • 2.

    The Lloyd's algorithm is modified for HMM, such that it can solve the problems in joint space learning.

  • 3.

    The proposed method is validated with a two-link planar robot and a four degree-of-freedom (4-DoF) exoskeleton robot (Garrido et al., 2016).

Section snippets

Codebook and key-points generation

The dynamics of a nlink robot in joint space can be written asM(q)q¨+c(q,q̇)q̇+g(q)+F(q̇)=τwhere qRn denotes the links angles, q̇Rn denotes the links velocity, M(q)Rn×n is the inertia matrix, c(q,q̇)Rn×n is the centripetal and Coriolis matrix, g(q)Rn is the gravity vector, FRn is the frictional terms, and τRn is the input control vector.

The object of this paper is to generate the desired trajectory q from the demonstrations in the joint space Q=[Q1Qn]T, Qi=[Xi1Xim]T. Here the

Joint space trajectory generation with modified hidden Markov model

After the codebook, key points, and the observation symbols are generated by Lloyd's algorithm, they are used for training a discrete Hidden Markov Model (HMM). This model can generate a desired trajectory in the joint space as we want, see Fig. 5.

An HMM can be understood as a finite state machine where at any instant N, the different states are defined as S={S1,S2,S3SN}. At regular intervals, the HMM may change its states. The time associated with the changes are denoted by t=1,2,, the

Experiments of learning trajectory

In this section, we evaluate the effectiveness of the proposed method with two examples: a two-link revolute joint arm (planar elbow manipulator) and our 4-DoF exoskeleton robot (Garrido et al., 2016). The experiments are implemented in Matlab without code optimization. The computer is a PC with Intel Core i3 3.30 Ghz processor. We use Kevin Murphy's HMM Toolbox (Murphy, 1998) (Baum–Welch algorithm (Baum et al., 1970)) to train HMM.

We compare our method (LHMM) with the other two approaches: HMM

Conclusions

The advantage of programming robot by demonstration in joint space is to avoid the inverse kinematics. The disadvantage is the demonstrations in joint space are strong time-dependent. This paper uses Lloyd's algorithm and modified HMM to solve the problems in joint space.

The contributions of the paper are: (1) we use Lloyd's algorithm to encode and quantize the input vectors. This approach improves accuracy compared with the dynamic time warping method. (2) The traditional HMM is modified such

References (26)

  • M. Hersch et al.

    Dynamical system modulation for robot learning via kinesthetic demonstrations

    IEEE Trans. Robot.

    (2008)
  • A.J. Ijspeert et al.

    Dynamical movement primitiveslearning attractor models for motor behaviors

    Neural Comput.

    (2012)
  • Y. Kuniyoshia et al.

    Embodied basis of invariant features in execution and perception of whole-body dynamic actions-knacks and focuses of Roll-and-Rise motion

    Robot. Auton. Syst.

    (2004)
  • Cited by (20)

    • A novel health prognosis method for system based on improved degenerated Hidden Markov model

      2022, Robotics and Computer-Integrated Manufacturing
      Citation Excerpt :

      Moreover, these models were generally applied to offline health prognostics as its recognition and training processes are time-consuming. In previous studies [22–24], the Lloyd algorithm was widely used to realize the scalarization of input data. It performs effectively for vibrational signal data type.

    • Error model-oriented vibration suppression control of free-floating space robot with flexible joints based on adaptive neural network

      2022, Engineering Applications of Artificial Intelligence
      Citation Excerpt :

      For nearly half a century, from the former Soviet Union’s “salute” space station to the EU’s “International Space Station” and then to China’s “sky one” space station, the space robot manipulator has been more and more widely used in exploration, assembly, construction, service, maintenance or assisting astronauts in extravehicular tasks (Dong et al., 2019; Fan and Tang, 2021; Meng et al., 2021; Seddaoui and Saaj, 2019; Zhao et al., 2021) However, due to the space environment in which the space robot is located and the free-floating mode of the matrix, it has different dynamic characteristics and restrictions from the ground robot: the dynamic coupling between the manipulator and the base, the dynamic singularity, the limited fuel supply and the restrictions of the attitude control system (Chen et al., 2021; Swei et al., 2020) . At the same time, there are many uncertainties in the dynamic model of space robot, such as manipulator mass, inertia matrix, load mass and other dynamic models cannot be obtained accurately, and the external disturbance signals have a certain impact on the controller (Garrido et al., 2016; Liu et al., 2022; Xia et al., 2021). For the control problem of free-floating space robot, in (Asar et al., 2019; Ning and Wu, 2019; Yan et al., 2022; Zhao et al., 2017), their own control methods based on adaptive control theory successively are proposed, solving the high-precision control problem of space robot under parameter uncertainty and external interference, and achieved good control effect.

    • A multi-objective approach for the trajectory planning of a 7-DOF serial-parallel hybrid humanoid arm

      2021, Mechanism and Machine Theory
      Citation Excerpt :

      Additionally, in the obtained trajectory, the initial and final values of the velocity and acceleration can be configured almost arbitrarily as need. The approach works on a space of the output angle of joint moving platform (OAJ), demanding less computational effort than performing the trajectory planning of the HHRA in operating space [23-25]. The OAJ space can be transferred to operating space through the equivalent forward kinematics analysis.

    • Review of research on human-robot collaboration safety in the context of intelligent construction

      2023, Dongnan Daxue Xuebao (Ziran Kexue Ban)/Journal of Southeast University (Natural Science Edition)
    View all citing articles on Scopus
    View full text