Learning inverse kinematics and dynamics of a robotic manipulator using generative adversarial networks

doi:10.1016/j.robot.2019.103386

Robotics and Autonomous Systems

Volume 124, February 2020, 103386

https://doi.org/10.1016/j.robot.2019.103386

Highlights

• GANs are used to solve the inverse kinematics and the inverse dynamics.
• Methods are evaluated with different dataset sizes and standard deviations.
• Two types of robotic manipulators, MICO and Fetch, are used in the experiments.

Abstract

Obtaining inverse kinematics and dynamics of a robotic manipulator is often crucial for robot control. Analytical models are typically used to approximate real robot systems, and various controllers have been designed on top of the analytical model to compensate for the approximation error. Recently, machine learning techniques have been developed for error compensation, resulting in better performance. Unfortunately, combining a learned compensator with an analytical model makes the designed controller redundant and computationally expensive. Also, general machine learning techniques require a lot of data to perform the training process and approximation, especially in solving high-dimensional problems. As a result, state-of-the-art machine learning applications are either expensive in terms of computation and data collection, or limited to a local approximation for a specific task or routine. In order to address the high dimensionality problem in learning inverse kinematics and dynamics, as well as to make the training process more data efficient, this paper presents a novel approach using a series of modified Generative Adversarial Networks (GANs). Namely, we use Conditional GANs (CGANs), Least Squares GANs (LSGANs), Bidirectional GANs (BiGANs) and Dual GANs (DualGANs). We trained and tested the proposed methods using real-world data collected from two types of robotic manipulators, a MICO robotic manipulator and a Fetch robotic manipulator. The data input to the GANs was obtained using a sampling method applied to the real data. The proposed approach enables approximating the real model using limited data without compromising the performance and accuracy. The proposed methods were tested in real-world experiments using unseen trajectories to validate the “learned” approximate inverse kinematics and inverse dynamics as well as to demonstrate the capability and effectiveness of the proposed algorithm over existing analytical models.

Introduction

Identification of the Inverse Kinematics (IK) and the Inverse Dynamics (ID) plays an important role in precise robot control and trajectory tracking [1], [2]. Existing literature details various approaches aimed at obtaining precise models of the system to lower feedback gain and improve adaptability in designing a stable controller [3], [4]. These techniques can be broadly classified into two categories: analytical methods, and numerical methods.

Analytical methods involve deriving an explicit mathematical model of the system under consideration from first principles. However, these methods rely on simplifying assumptions, prior knowledge, and experimental parameter estimations using the real system. Imperfections in any of the above can cause the analytical model to differ from the real system. In most cases, deriving the underlying mathematical model is unnecessarily complicated, and the resulting model can suffer from singularities and nonlinearities [5], [6]. In contrast, numerical methods are data-driven and can provide approximate solutions within a desired tolerance [7]. With dedicated algorithms and sufficient data collected from real-world experiments, numerical methods can learn the uncertain part of the real system that is difficult to model, and thereby provide better predictions of the system behavior [8].

Over the past few decades, the applicability of machine learning has improved greatly along with improvements in the computational capability of hardware. Many techniques have been developed to solve highly nonlinear problems, such as learning the sequences of motion primitives for robot manipulation [9], cleaning a table [10], and generating trajectories for biped robots to follow ZMP critics [11]. The majority of existing techniques have focused on solving high-level tasks or trajectory planning, while using a general model-based controller for the low-level actions, resulting in a hybrid control system. Reinforcement learning techniques became popular in the research community due to the applicability of physics engine simulations [12] and a replay buffer [13]. However, in many cases, relying on the analytical model behind the physics engine instead of using real-world data builds a gap between the simplified analytical model and the complex real-world system.

Applying machine learning techniques to acquire the IK and ID of a given system has a history of almost two decades in the research community. Karlik et al. worked on finding the best Artificial Neural Network (ANN) configuration to solve the IK problem for a six Degree-of-Freedom (DOF) robotic arm [14]. A comparison of Radial Basis Function (RBF) and Multilayer Perceptron (MLP) networks in solving the IK of a 6-DOF arm was performed in [15]. A neural network architecture combined with evolutionary techniques was used to solve the IK of a 6-DOF Stanford robotic manipulator in [16]. In addition to planar manipulators [17], the IK of a spatial 3-DOF structure was studied in [18]. Instead of using a single-agent neural network to solve the kinematic problem, Ansari et al. applied an actor-critic architecture (two agents in one neural network architecture) to learn the IK of a 6-DOF robotic manipulator inside a reinforcement learning environment. However, this work explored only a discrete action space (joint space) instead of the continuous action space [19]. In addition to offline training techniques, an adaptive online strategy based on the Lyapunov stability theorem was presented to solve for the IK of redundant manipulators in [20]. Multiple soft computing algorithms for solving the IK of different robotic manipulators were compared in [7]. However, the majority of existing works used analytical models as ground truth or used analytical models embedded inside physics-based simulations, instead of using a dataset collected from the real world.

Compensation methods using reinforcement learning were developed for better trajectory following, and the learning process was demonstrated in real-world online conditions [1]. Even though the compensator was learned, the controller still relied on analytical models.

Compared to the IK, learning the ID is more difficult due to the high dimensionality of the input. To address this issue, existing techniques in this domain have used analytical models along with learning approaches to handle the modeling error. To this end, Meier et al. proposed a nonlinear function approximator to learn a constant error model in order to improve tracking performance on specific trajectories [21]. On the other hand, Rayyes et al. proposed learning the inverse statics model by taking advantage of the symmetry of the robot [22]. However, the improved efficiency offered by this method is limited to symmetric robot designs. Machine learning methods have also been used to learn rich dynamics, as in the case of a soft robotic manipulator [8], [23], [24]. Deep learning networks along with physics-based simulators have also been used to study robot dynamics [25]. Reinforcement learning techniques have also been used to learn the closed-loop predictive controller for a real robot [8].
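For reference, the standard rigid-body form of the inverse dynamics found in robotics textbooks makes the dimensionality argument concrete. It is given here as a sketch under the usual rigid-link assumptions and is not necessarily the formulation used in this paper; the symbols $M$ (inertia matrix), $C$ (Coriolis and centrifugal terms), $g$ (gravity torques), $\tau$ (joint torques) and $n$ (number of joints) are introduced only for illustration.

$$
\tau = M(q)\,\ddot{q} + C(q,\dot{q})\,\dot{q} + g(q),
\qquad
ID : (q, \dot{q}, \ddot{q}) \in \mathbb{R}^{3n} \to \tau \in \mathbb{R}^{n}
$$

For an $n$-joint arm the ID input therefore has $3n$ components (joint positions, velocities and accelerations), whereas the IK takes only the task-space coordinates as input.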

Similar to other numerical methods, the need for a large dataset plays an important role in training the neural network to approximate the target model. As such, data collection is the most time consuming and expensive part in the global estimation of the ID. The proposed approach in [8] requires real-world data collection lasting approximately two hours to develop a closed-loop controller from scratch. To overcome the problem of training a neural network with limited data, Generative Adversarial Networks (GANs) were proposed by the computer vision research community. The idea was to create additional “fake” data similar to the real world data, and thereby enlarge the total dataset available for training the target neural network [26], [27], [28]. GANs have also been used in inverse reinforcement learning to recover the reward functions embedded in training environments to perform specific tasks [29], [30]. In a similar fashion, our work aims to approximate the real model globally using a limited real-world dataset, which is augmented with fake data generated using GANs. The main contributions of this paper are as follows:

• We extend the success of GANs in the domain of computer vision towards learning the IK and the ID in cases where real-world data collection is expensive and the underlying mapping is highly nonlinear with high-dimensional inputs.
• Four types of popular GANs, namely CGANs [31], LSGANs [32], BiGANs [33], and DualGANs [34], were modified to solve the IK and the ID problems. The performance of these methods was compared on unseen real-world trajectories.
• Experimental evaluation of these methods was performed on a 3-DOF MICO robotic manipulator [35] and an 8-DOF Fetch robotic manipulator. To test the efficiency of the proposed modified GANs, all training processes were performed on a limited real-world dataset (collected over a period of 40 minutes). The performance of the training process was also evaluated using different sizes of the partial dataset and different standard deviations for the generator in the GANs.
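As one concrete illustration of how the GAN variants listed above differ, the least-squares objective of LSGANs [32] replaces the log-loss of the original GAN with a squared error on the discriminator's scores. The sketch below uses the common 0/1 target coding from the LSGAN literature; the function names are hypothetical, and the modified losses actually used in this paper may differ.

```python
def lsgan_discriminator_loss(d_real, d_fake):
    """Least-squares discriminator loss: push real scores to 1, fake scores to 0.

    d_real, d_fake: non-empty lists of discriminator scores (floats).
    """
    real_term = sum((d - 1.0) ** 2 for d in d_real) / len(d_real)
    fake_term = sum(d ** 2 for d in d_fake) / len(d_fake)
    return 0.5 * real_term + 0.5 * fake_term


def lsgan_generator_loss(d_fake):
    """Least-squares generator loss: push scores of generated samples to 1."""
    fake_term = sum((d - 1.0) ** 2 for d in d_fake) / len(d_fake)
    return 0.5 * fake_term
```

A discriminator that scores every sample at 0.5 incurs a loss of 0.25, and the loss reaches zero only when real and generated samples are perfectly separated.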

The rest of the paper is organized as follows. Section 2 introduces the IK and ID of the robotic manipulators used in this paper. A brief introduction to generative adversarial networks (GANs) is also presented. Section 3 describes the modified GANs for learning the IK and ID. Details on the design of the neural network and sampling methods are presented. Section 4 discusses the simulation and experiment results. Finally, Section 5 concludes the work with directions for future research.

Section snippets

Inverse kinematics and inverse dynamics

Kinematics describes the relationship between the coordinates in the joint space, $q$, and those in the task space, $x$. The forward kinematics maps the joint space to the task space, $FK : q \to x$, while the inverse kinematics is the opposite mapping, $IK : x \to q$. Many methods have been developed to solve the kinematics problem, such as the geometric method and the Denavit–Hartenberg (DH) method. The closed loop equations have singularities and nonlinearities and thereby make the IK solving
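To make the two mappings concrete, the following minimal sketch implements $FK$ and a closed-form $IK$ for a planar two-link arm. This toy arm, the link lengths `l1` and `l2`, and the function names are assumptions chosen for illustration; they are not the MICO or Fetch models studied in the paper.

```python
import math


def forward_kinematics(q1, q2, l1=1.0, l2=1.0):
    """FK: joint angles (rad) -> end-effector position (x, y)."""
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return x, y


def inverse_kinematics(x, y, l1=1.0, l2=1.0, elbow=1):
    """IK: position (x, y) -> joint angles (q1, q2).

    elbow=+1 or -1 selects one of the two mirror-image solutions.
    """
    # Law of cosines gives the cosine of the elbow angle.
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(c2) > 1.0 + 1e-9:
        raise ValueError("target is outside the reachable workspace")
    c2 = max(-1.0, min(1.0, c2))  # guard against rounding error
    q2 = elbow * math.acos(c2)
    q1 = math.atan2(y, x) - math.atan2(l2 * math.sin(q2), l1 + l2 * math.cos(q2))
    return q1, q2
```

Even in this simple case the inverse mapping is not unique: both values of `elbow` reach the same point, and targets beyond the arm's reach have no solution at all.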

Proposed algorithm

This section describes the proposed GAN architecture to approximate the IK and the ID of both MICO and Fetch robotic manipulators using real-world data.

Data collection for MICO robotic manipulator

To avoid overfitting of the training model in local paths, random trajectories, instead of predefined trajectories, were generated for the end effector of the MICO robotic manipulator to follow. Each trajectory consisted of multiple waypoints, distributed over the whole actuation space (joint space). In sampling the waypoints, two criteria were used to ensure the feasibility and safety in achieving the desired motion: (1) the joint positions should fall within the feasible configuration space
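A rough sketch of the random waypoint idea is given below. It enforces only criterion (1), since the remaining criterion is not available in this excerpt; the joint limits, the uniform sampling and all names are hypothetical and do not reflect the MICO's real limits or the authors' sampling procedure.

```python
import random

# Hypothetical joint limits in radians (lower, upper); not the MICO's real limits.
JOINT_LIMITS = [(-3.0, 3.0), (-2.0, 2.0), (-2.5, 2.5)]


def sample_waypoint(rng, limits=JOINT_LIMITS):
    """Draw one joint-space waypoint uniformly within the given limits."""
    return [rng.uniform(lo, hi) for lo, hi in limits]


def sample_trajectory(rng, n_waypoints, limits=JOINT_LIMITS):
    """Build a random trajectory as a list of independently sampled waypoints."""
    return [sample_waypoint(rng, limits) for _ in range(n_waypoints)]
```

Passing a seeded generator, for example `sample_trajectory(random.Random(0), 5)`, keeps the sampled trajectories reproducible between runs.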

Conclusion

In this paper, we have introduced a series of modified Generative Adversarial Networks for solving the inverse kinematics and dynamics of robots using real-world experimental data. Existing research has focused on learning the uncertainty along with a simplified analytical model or predicting the hindsight analytic model, which could then be used as ground truth. However, existing techniques do not allow for global estimation of the underlying model of the system. Moreover, they require

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors would like to thank Jiteng Yang, Bijo Sebastian and Daniel Budolak for their help in this work. We also gratefully acknowledge the support of NVIDIA Corporation, USA, with the donation of the Titan Xp GPU used for this research.

Hailin Ren (S’16) received the B.S. degree from Nanjing University of Science and Technology in 2013 and the M.S. degree from Columbia University in 2014. He is currently pursuing the Ph.D. degree at the Virginia Polytechnic Institute and State University (Virginia Tech) under the supervision of Prof. P. Ben-Tzvi. His research interests include Artificial Intelligence, Computer Vision and Autonomous Robotics.

References (44)

Pane Y.P. et al.
Reinforcement learning based compensation methods for robot manipulators
Eng. Appl. Artif. Intell.
(2019)

Shareef Z. et al.
Improving the inverse dynamics model of the KUKA LWR IV+ using independent joint learning
IFAC-PapersOnLine
(2016)

Thu K.M. et al.
Designing and modeling of quadcopter control system using L1 adaptive control
Procedia Comput. Sci.
(2017)

Hawley L. et al.
Control framework for cooperative object transportation by two humanoid robots
Robot. Auton. Syst.
(2019)

Kucuk S. et al.
Inverse kinematics solutions for industrial robot manipulators with offset wrists
Appl. Math. Model.
(2014)

Lopez-Franco C. et al.
A soft computing approach for inverse kinematics of robot manipulators
Eng. Appl. Artif. Intell.
(2018)

Martínez D. et al.
Planning robot manipulation to clean planar surfaces
Eng. Appl. Artif. Intell.
(2015)

Karlik B. et al.
An improved approach to the solution of inverse kinematics problems for robot manipulators
Eng. Appl. Artif. Intell.
(2000)

Chiddarwar S.S. et al.
Comparison of RBF and MLP neural networks to solve inverse kinematic problem for 6R serial robot by a fusion approach
Eng. Appl. Artif. Intell.
(2010)

Köker R.
A genetic algorithm approach to a neural-network-based inverse kinematics solution of robotic manipulators based on error minimization
Inform. Sci.
(2013)

Duka A.-V.
Neural network based inverse kinematics solution for trajectory tracking of a robotic arm
Proc. Technol.
(2014)

Toshani H. et al.
Real-time inverse kinematics of redundant manipulators using neural networks and quadratic programming: A Lyapunov-based approach
Robot. Auton. Syst.
(2014)

Rone W.S. et al.
Continuum robot dynamics utilizing the principle of virtual power
IEEE Trans. Robot.
(2014)

Thuruthel T.G. et al.
Model-based reinforcement learning for closed-loop dynamic control of soft robotic manipulators
IEEE Trans. Robot.
(2019)

Stulp F. et al.
Reinforcement learning with sequences of motion primitives for robust manipulation
IEEE Trans. Robot.
(2012)

Hu K. et al.
Learning and generalization of compensative zero-moment point trajectory for biped walking
IEEE Trans. Robot.
(2016)

Gosavi A.
Simulation-based optimization

Lillicrap T.P. et al.
Continuous control with deep reinforcement learning
(2015)

Csiszar A. et al.
On solving the inverse kinematics problem using neural networks

Ansari Y. et al.
A multiagent reinforcement learning approach for inverse kinematics of high dimensional manipulators with precision positioning

Meier F. et al.
Towards robust online inverse dynamics learning

Rayyes R. et al.
Learning inverse statics models efficiently with symmetry-based exploration
Front. Neurorobotics
(2018)


Pinhas Ben-Tzvi (S’02–M’08–SM’12) received the B.S. degree (summa cum laude) in mechanical engineering from the Technion—Israel Institute of Technology, and the M.S. and Ph.D. degrees in mechanical engineering from the University of Toronto. He is currently an Associate Professor of Mechanical Engineering at Virginia Tech. His current research interests include robotics and intelligent autonomous systems, human–robot interactions, robotic vision, machine learning, mechatronics design, systems dynamics and control, and novel sensing and actuation.

Dr. Ben-Tzvi is the recipient of the 2019 Virginia Tech Teaching Excellence Award and the 2018 Faculty Fellow Award. Dr. Ben-Tzvi is Technical Editor for the IEEE/ASME Transactions on Mechatronics, Associate Editor for ASME Journal of Mechanisms and Robotics, and an Associate Editor for IEEE Robotics and Automation Magazine, Automation and Systems and served as an Associate Editor for IEEE ICRA 2013-2018. He is a member of the American Society of Mechanical Engineers (ASME).
