Robot trajectory generation using modified hidden Markov model and Lloyd's algorithm in joint space

doi:10.1016/j.engappai.2016.03.006

Engineering Applications of Artificial Intelligence

Volume 53, August 2016, Pages 32-40

https://doi.org/10.1016/j.engappai.2016.03.006 Get rights and content

Abstract

Human guide robots usually generate desired trajectories from human demonstrations. The training process can be in task space or joint space. The task space method needs the inverse kinematics; the joint space method uses dynamic time warping. Both of them destroy the accuracy of the generated trajectory.

In this paper, we first use Lloyd's algorithm to encode the input signals such that the observations are time-independent. The desired trajectory is generated in joint space without dynamic time warping. Then we modify the hidden Markov model (HMM) such that it can work in joint space. Since the desired trajectories are the joint angles, they can be applied directly to robot control without calculating the inverse kinematics. Simulation and experimental results show that the modified HMM with Lloyd's algorithm work well in joint space.

Introduction

The control structure of many human guide robots, such as robot exoskeleton, humanoid robot, and surgical robot, has three levels: joint angle tracking, trajectory planning, and path planning, see Fig. 1. The path planning is in task space. It generates a path from a starting-point to an ending-point with respect to some restrictions. The trajectory planning can be in task space or joint space. It gives the desired trajectories of the end-effector (task space), or desired joint angles (joint space), which are in the path generated by the path planning. The joint angle tracking usually uses PD/PID controller to force the joints to follow the desired angles generated by the trajectory planning.

The final problem of this control structure is to generate desired joint angles which satisfy human requirements (Xu and Lee, 2005). This is the object of transferring human skill to the robot through demonstrations. We also call it programming by demonstration (PbD) or learning from demonstration (LfD) (Argalla et al., 2009). The robot trajectory generation can be broadly divided into two trends: (1) Symbolic level. The human skill is decomposed into a sequence of action–perception units, then a statistical model is used to deal with the demonstrations (Hersch et al., 2008, Billard et al., 2007). (2) Trajectories level. A nonlinear mapping is used to model the sensor/motor information. The trajectories level method is robust to the environment changes (Ijspeert et al., 2012). The symbolic level method is suitable to model complex human actions. In this paper we use the symbolic level method.

Although the trajectory planning in joint space can avoid the calculation of the inverse kinematics, the demonstrations in joint space are time-dependent. Fig. 2 shows the trajectories in joint space and task space, when a two-link planar robot draws a broken line. Since the trajectories in task space only give space relation, the three lines overlap in task space. However, in joint space they are complete different because of different drawing speeds. In this sense, training in task space is easier than in joint space (Vakanski et al., 2012, Kuniyoshia et al., 2004, Wang et al., 2013, Gribovskaya et al., 2011). After the training in task space, the inverse kinematics need to be solved, which requires complete knowledge of the robot.

There are few works in joint space (Berg et al., 2010, Peters and Schaal, 2008). The dynamic time warping (DTW) is an effective tool to deal with the time-dependent problem. The computation time for one-dimensional signals, such as time series, is in polynomial. The extension of DTW for more than two-dimension, like robots, becomes NP-complete. The accuracy of the high dimension approximation is also very low (Sakoe and Chiba, 1978).

Statistical learning techniques deal with the high variability inherent in the demonstrations. They are not sensitive to disturbances. For instance, spline smoothing technique can deal with the uncertainty in several motion demonstrations (Ude, 1993). The mean and variance of the collected variables are applied in Ogawara et al. (2003) to generate a model. Ito et al. (2006) realizes online imitation by encoding two different motor loops.

Hidden Markov Model (HMM) generates a sequence which is called Markov chain (Rabiner, 1989, Yang et al., 1997). It can encode the motion of a robot, and find the highest probability state path by Viterbi algorithm (Viterbi, 1967). HMMs use finite Gaussian mixture models as their hidden state distributions. The Gaussian mixture model can encode a set of trajectories (Argalla et al., 2009). The Gaussian mixture regression can retrieve a smooth trajectory from several demonstrations (Billard et al., 2007). There are many successful application of robot trajectory generation via HMM (Yang et al., 1997, Kwon and Park, 2008). HMM offers many advantages over the other statistical models for human behavior modeling, such as better compression, variant structures, and training incrementally. One weakness of HMM is that the trajectory generation can only use current state for the emission and the transition probabilities. HMM does not map well to many time-dependent domains, such as joint space (Mohamed and Gader, 2000).

To train HMM, it is necessary to map continuous trajectories into discrete values, named codebook. It is impossible to use all sampled data to train HMM. The key-points include necessary information for HMM. The normal method of selecting the key-points uses the shape of the trajectory. It can be positions evaluation (Yang et al., 1997) or position/velocity evaluation (Vakanski et al., 2012). Linde–Buzo–Gray (LBG) is the most popular method. The above methods do not work well in joint space, because the trajectories in joint space are time-dependent, while these methods use the shape information (Gernot, 2008). Lloyd's algorithm partitions data into well-shaped and uniformly sized convex cells (Lloyd, 1982). It repeatedly finds the centroid of each set in the partition using Voronoi diagrams. In this paper, we use Lloyd's algorithm to avoid the following two problems: (1) calculation problem of the dynamic time warping in joint space, (2) selection problems of the codebook and key-points in joint space. The classical HMM is also modified such that it is more feasible to generate joint trajectory. A new element is introduced for HMM.

In order to generate the desired robot trajectory, this paper proposes a different method from the previous works like inverse kinematics and dynamic time warping. The contributions of the paper include three parts:

1.
The classical HMM is modified such that it is more feasible to generate trajectory in joint space. We introduce a new auxiliary output in HMM to help training process.
2.
The Lloyd's algorithm is modified for HMM, such that it can solve the problems in joint space learning.
3.
The proposed method is validated with a two-link planar robot and a four degree-of-freedom (4-DoF) exoskeleton robot (Garrido et al., 2016).

Section snippets

Codebook and key-points generation

The dynamics of a $n -$ link robot in joint space can be written as $M (q) \ddot{q} + c (q, \dot{q}) \dot{q} + g (q) + F (\dot{q}) = τ$ where $q \in R^{n}$ denotes the links angles, $\dot{q} \in R^{n}$ denotes the links velocity, $M (q) \in R^{n \times n}$ is the inertia matrix, $c (q, \dot{q}) \in R^{n \times n}$ is the centripetal and Coriolis matrix, $g (q) \in R^{n}$ is the gravity vector, $F \in R^{n}$ is the frictional terms, and $τ \in R^{n}$ is the input control vector.

The object of this paper is to generate the desired trajectory q^⁎ from the demonstrations in the joint space $Q = {[Q_{1} \dots Q_{n}]}^{T},$ $Q_{i} = {[X_{i}^{1} \dots X_{i}^{m}]}^{T}$ . Here the

Joint space trajectory generation with modified hidden Markov model

After the codebook, key points, and the observation symbols are generated by Lloyd's algorithm, they are used for training a discrete Hidden Markov Model (HMM). This model can generate a desired trajectory in the joint space as we want, see Fig. 5.

An HMM can be understood as a finite state machine where at any instant $N$ , the different states are defined as $S = {S_{1}, S_{2}, S_{3} \dots S_{N}}$ . At regular intervals, the HMM may change its states. The time associated with the changes are denoted by $t = 1, 2, \dots$ , the

Experiments of learning trajectory

In this section, we evaluate the effectiveness of the proposed method with two examples: a two-link revolute joint arm (planar elbow manipulator) and our 4-DoF exoskeleton robot (Garrido et al., 2016). The experiments are implemented in Matlab without code optimization. The computer is a PC with Intel Core i3 3.30 Ghz processor. We use Kevin Murphy's HMM Toolbox (Murphy, 1998) (Baum–Welch algorithm (Baum et al., 1970)) to train HMM.

We compare our method (LHMM) with the other two approaches: HMM

Conclusions

The advantage of programming robot by demonstration in joint space is to avoid the inverse kinematics. The disadvantage is the demonstrations in joint space are strong time-dependent. This paper uses Lloyd's algorithm and modified HMM to solve the problems in joint space.

The contributions of the paper are: (1) we use Lloyd's algorithm to encode and quantize the input vectors. This approach improves accuracy compared with the dynamic time warping method. (2) The traditional HMM is modified such

References (26)

M. Ito et al.
Dynamic and interactive generation of object handling behaviors by a small humanoid robot using a dynamic neural network model
Neural Netw.
(2006)
A. Ude
Trajectory generation from noisy positions of object features for teaching robot paths
Robot. Auton. Syst.
(1993)
Amelia Zafra et al.
HyDR-MIa hybrid algorithm to reduce dimensionality in multiple instance learning
Inf. Sci.
(2013)
B.D. Argalla et al.
A survey of robot learning from demonstration
Robot. Auton. Syst.
(2009)
L.E. Baum et al.
A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains
Ann. Math. Stat.
(1970)
Berg, J.V.D., Miller, S., Duckworth, D., Hu, H., Wan, A., Fu, X., Goldberg, K., Abbeel, P., 2010. Superhuman...
A. Billard et al.
Handbook of Robotics-Robot Programming by Demonstration
(2007)
Garrido, J., Yu, W., Li, X., 2016. Modular design and control of an upper limb exoskeleton. J. Mech. Sci. Technol. In...
A.F. Gernot
Markov Models for Pattern RecognitionFrom Theory to Applications
(2008)
E. Gribovskaya et al.
Learning non-linear multivariate dynamics of motion in robotic manipulators
Int. J. Robot. Res.
(2011)

M. Hersch et al.

Dynamical system modulation for robot learning via kinesthetic demonstrations

IEEE Trans. Robot.

(2008)

A.J. Ijspeert et al.

Dynamical movement primitiveslearning attractor models for motor behaviors

Neural Comput.

(2012)

Y. Kuniyoshia et al.

Embodied basis of invariant features in execution and perception of whole-body dynamic actions-knacks and focuses of Roll-and-Rise motion

Robot. Auton. Syst.

(2004)

Cited by (20)

A novel health prognosis method for system based on improved degenerated Hidden Markov model
2022, Robotics and Computer-Integrated Manufacturing
Citation Excerpt :
Moreover, these models were generally applied to offline health prognostics as its recognition and training processes are time-consuming. In previous studies [22–24], the Lloyd algorithm was widely used to realize the scalarization of input data. It performs effectively for vibrational signal data type.
Health prognosis for power system is considered as a crucial process of condition-based maintenance. In order to solve the problem of large deviation between Hidden Markov Model and actual system health diagnosis, this paper developed an improved Degenerated Hidden Markov Model (DGHMM) with a core of the quasi power relation. First, the model adopts the degradation factors described the process of recession for equipment continuous decreasing in performance. Compared with the conventional exponential accelerated degradation, the quasi power relation accelerated degradation can better describe the process that the performance of the system decreases gradually with the increase of service age. Then, the improved genetic algorithm can replace the conventional EM algorithm for parameters estimation, which overcomes the limitation that the EM algorithm is easy to fall into local optimization. At the same time, in terms of the limitation of life prognosis problem as a result of the Hidden Markov Model must obey exponential distribution, an algorithm named greed approximation based on approximation algorithm and Viterbi algorithm can be proposed to seek maximum probability remaining observation for the purpose of seeking maximum probability dynamically surplus state path to predict the residual life of system. Finally, the proposed method is validated and evaluated with the data set of caterpillar hydraulic pumps. The results show that the method of system health diagnosis and life prognosis based on the improved degraded hidden Markov model is more effective in describing system degeneration and the accuracy of equipment state diagnosis, and is also feasible in the prediction of residual life.
Error model-oriented vibration suppression control of free-floating space robot with flexible joints based on adaptive neural network
2022, Engineering Applications of Artificial Intelligence
Citation Excerpt :
For nearly half a century, from the former Soviet Union’s “salute” space station to the EU’s “International Space Station” and then to China’s “sky one” space station, the space robot manipulator has been more and more widely used in exploration, assembly, construction, service, maintenance or assisting astronauts in extravehicular tasks (Dong et al., 2019; Fan and Tang, 2021; Meng et al., 2021; Seddaoui and Saaj, 2019; Zhao et al., 2021) However, due to the space environment in which the space robot is located and the free-floating mode of the matrix, it has different dynamic characteristics and restrictions from the ground robot: the dynamic coupling between the manipulator and the base, the dynamic singularity, the limited fuel supply and the restrictions of the attitude control system (Chen et al., 2021; Swei et al., 2020) . At the same time, there are many uncertainties in the dynamic model of space robot, such as manipulator mass, inertia matrix, load mass and other dynamic models cannot be obtained accurately, and the external disturbance signals have a certain impact on the controller (Garrido et al., 2016; Liu et al., 2022; Xia et al., 2021). For the control problem of free-floating space robot, in (Asar et al., 2019; Ning and Wu, 2019; Yan et al., 2022; Zhao et al., 2017), their own control methods based on adaptive control theory successively are proposed, solving the high-precision control problem of space robot under parameter uncertainty and external interference, and achieved good control effect.
At the same time, considering the uncertain factors such as load variation, external interference and joint flexibility in engineering practice, a robust control method based on adaptive neural network is proposed. The dynamic model of free-floating space robot is established, and the error model caused by uncertain factors is deduced. Different from the traditional compensation algorithm that ignores the error model, a compensation controller based on Radial basis function neural network (RBFNN) is designed to approximate the error model. The approximation error is eliminated by robust controller to improve the control accuracy. In order to make full use of the nonlinear approximation ability of neural network, the error model is decomposed into four parts according to the input characteristics, and the neural network compensator is designed for separate and overall compensation, which further improves the control accuracy and robustness. The adaptive learning rates of network weights are designed to ensure online real-time adjustment without offline learning stage. A flexible compensator based on torque and a controller based on moment difference feedback controller (MDFC) are designed to suppress elastic vibration. Simulation and experimental studies show that the proposed strategy can have good compensation performance and robustness, and can better suppress elastic vibration, which proves the effectiveness and superiority of the proposed scheme.
A multi-objective approach for the trajectory planning of a 7-DOF serial-parallel hybrid humanoid arm
2021, Mechanism and Machine Theory
Citation Excerpt :
Additionally, in the obtained trajectory, the initial and final values of the velocity and acceleration can be configured almost arbitrarily as need. The approach works on a space of the output angle of joint moving platform (OAJ), demanding less computational effort than performing the trajectory planning of the HHRA in operating space [23-25]. The OAJ space can be transferred to operating space through the equivalent forward kinematics analysis.
Humanoid robots are an elusive long-term goal in robotics, and a humanoid arm is an essential actuator. Inspired by the motion characteristics of the human arm, a seven-degree-of-freedom (DOF) serial-parallel hybrid humanoid robotic arm is proposed. With the mechanism as an example, a solution of multi-objective trajectory planning is addressed using an indirect method to execute a point-to-point mission with faster time, lower energy and higher stability. First, to avoid abrupt changes of the joint angle, the path with a better joint-rate distribution is gained based on normalized Jacobian condition number. On the basis of this path, a multi-optimization model considering travelling time, energy consumption and torque fluctuation is formulated to plan the trajectory interpolated by quintic B-splines. The elitist non-dominated sorting genetic algorithm (NSGA-II) is used to address the model, and the optimal solution is determined by defining the weight value of the objective. Then, the complete dynamic model is obtained by combining equilibrium and deformation coordination equations. Finally, compared with another important trajectory planning methodology, the feasibility and effectiveness of the proposed methodology are validated through numerical simulations.
Health Prognosis for Equipment Based on ACO-K-Means and MCS-SVM under Small Sample Noise Unbalanced Data
2024, Tehnicki Vjesnik
Research on the precise reconstruction method of robot demonstration trajectory based on feedforward hidden Markov model
2023, Yi Qi Yi Biao Xue Bao/Chinese Journal of Scientific Instrument
Review of research on human-robot collaboration safety in the context of intelligent construction
2023, Dongnan Daxue Xuebao (Ziran Kexue Ban)/Journal of Southeast University (Natural Science Edition)

View all citing articles on Scopus

View full text

Robot trajectory generation using modified hidden Markov model and Lloyd's algorithm in joint space

Abstract

Introduction

Section snippets

Codebook and key-points generation

Joint space trajectory generation with modified hidden Markov model

Experiments of learning trajectory

Conclusions

Neural Netw.

Robot. Auton. Syst.

Inf. Sci.

A survey of robot learning from demonstration

Robot. Auton. Syst.

A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains

Ann. Math. Stat.

Handbook of Robotics-Robot Programming by Demonstration

Markov Models for Pattern RecognitionFrom Theory to Applications

Learning non-linear multivariate dynamics of motion in robotic manipulators

Int. J. Robot. Res.

Dynamical system modulation for robot learning via kinesthetic demonstrations

IEEE Trans. Robot.

Dynamical movement primitiveslearning attractor models for motor behaviors

Neural Comput.

Embodied basis of invariant features in execution and perception of whole-body dynamic actions-knacks and focuses of Roll-and-Rise motion

Robot. Auton. Syst.