Learning inverse kinematics and dynamics of a robotic manipulator using generative adversarial networks

https://doi.org/10.1016/j.robot.2019.103386

Highlights

  • GANs are used to solve the inverse kinematics and inverse dynamics problems.

  • Methods are evaluated with different sizes of dataset and standard deviations.

  • Two types of robotic manipulators, MICO and Fetch, are used in the experiments.

Abstract

Obtaining inverse kinematics and dynamics of a robotic manipulator is often crucial for robot control. Analytical models are typically used to approximate real robot systems, and various controllers have been designed on top of the analytical model to compensate for the approximation error. Recently, machine learning techniques have been developed for error compensation, resulting in better performance. Unfortunately, combining a learned compensator with an analytical model makes the designed controller redundant and computationally expensive. Also, general machine learning techniques require a lot of data to perform the training process and approximation, especially in solving high-dimensional problems. As a result, state-of-the-art machine learning applications are either expensive in terms of computation and data collection, or limited to a local approximation for a specific task or routine. In order to address the high dimensionality problem in learning inverse kinematics and dynamics, as well as to make the training process more data efficient, this paper presents a novel approach using a series of modified Generative Adversarial Networks (GANs). Namely, we use Conditional GANs (CGANs), Least Squares GANs (LSGANs), Bidirectional GANs (BiGANs) and Dual GANs (DualGANs). We trained and tested the proposed methods using real-world data collected from two types of robotic manipulators, a MICO robotic manipulator and a Fetch robotic manipulator. The data input to the GANs was obtained using a sampling method applied to the real data. The proposed approach enables approximating the real model using limited data without compromising the performance and accuracy. The proposed methods were tested in real-world experiments using unseen trajectories to validate the “learned” approximate inverse kinematics and inverse dynamics as well as to demonstrate the capability and effectiveness of the proposed algorithm over existing analytical models.

Introduction

Identification of the Inverse Kinematics (IK) and the Inverse Dynamics (ID) plays an important role in precise robot control and trajectory tracking [1], [2]. Existing literature details various approaches aimed at obtaining precise models of the system to lower feedback gain and improve adaptability in designing a stable controller [3], [4]. These techniques can be broadly classified into two categories: analytical methods, and numerical methods.

Analytical methods involve deriving an explicit mathematical model of the system under consideration from first principles. However, these methods rely on simplifying assumptions, prior knowledge, and experimental parameter estimations using the real system. Imperfections in any of the above can cause the analytical model to differ from the real system. In most cases, deriving the underlying mathematical model is unnecessarily complicated, and could suffer from singularities and nonlinearities [5], [6]. In contrast, numerical methods are data-driven and can provide approximate solutions within a desired tolerance [7]. With dedicated algorithms and sufficient data collected from real-world experiments, numerical methods can learn the uncertainty part in the real system that is difficult to model, and thereby provide better predictions of the system behavior [8].

Over the past few decades, the applicability of machine learning has improved greatly along with improvements in the computational capability of hardware. Many techniques have been developed to solve highly nonlinear problems, such as learning the sequences of motion primitives for robot manipulation [9], cleaning a table [10], and generating trajectories for biped robots to follow ZMP critics [11]. The majority of existing techniques have focused on solving high-level tasks or trajectory planning, while using a general model-based controller for the low-level actions, resulting in a hybrid control system. Reinforcement learning techniques became popular in the research community due to the applicability of physics engine simulations [12] and a replay buffer [13]. However, in many cases, relying on the analytical model behind the physics engine instead of using real-world data builds a gap between the simplified analytical model and the complex real-world system.

Applying machine learning techniques to acquire the IK and ID of a given system has a history of almost two decades in the research community. Karlik et al. worked on finding the best Artificial Neural Network (ANN) configuration to solve the IK problem for a six Degree-of-Freedom (DOF) robotic arm [14]. A comparison of Radial Basis Function (RBF) networks and Multilayer Perceptron (MLP) networks in solving the IK of a 6-DOF arm was performed in [15]. A neural network architecture combined with evolutionary techniques was used to solve the IK of a 6-DOF Stanford robotic manipulator in [16]. In addition to planar manipulators [17], the IK of a spatial 3-DOF structure was studied in [18]. Instead of using a single-agent neural network to solve the kinematic problem, Ansari et al. applied an actor-critic architecture (two agents in one neural network architecture) to learn the IK of a 6-DOF robotic manipulator inside a reinforcement learning environment. However, this work explored only a discrete action space (joint space) instead of the continuous action space [19]. In addition to offline training techniques, an adaptive online strategy based on the Lyapunov stability theorem was presented to solve for the IK of redundant manipulators in [20]. Multiple soft computing algorithms for solving the IK of different robotic manipulators were compared in [7]. However, the majority of existing works used analytical models as ground truth or used analytical models embedded inside physics-based simulations, instead of using a dataset collected from the real world.

Compensation methods using reinforcement learning were developed for better trajectory following, and the learning process was demonstrated in real-world online conditions [1]. Even though the compensator was learned, they also used analytical models inside the controller.

Compared to the IK, learning the ID is more difficult due to the high dimensionality of the input. To address this issue, existing techniques in this domain have used analytical models along with learning approaches to handle the modeling error. To this extent, Meier et al. proposed a nonlinear function approximator to learn a constant error model in order to improve tracking performance on specific trajectories [21]. On the other hand, Rayyes et al. proposed learning the inverse statics model by taking advantage of the symmetry of the robot [22]. However, the improved efficiency offered by this method is limited to symmetric robot designs. Machine learning methods have also been used to learn rich dynamics as in the case of a soft robotic manipulator [8], [23], [24]. Deep learning networks along with physics-based simulators have also been used to study robot dynamics [25]. Reinforcement learning techniques have also been used to learn the closed-loop predictive controller for a real robot [8].

Similar to other numerical methods, the need for a large dataset plays an important role in training the neural network to approximate the target model. As such, data collection is the most time consuming and expensive part in the global estimation of the ID. The proposed approach in [8] requires real-world data collection lasting approximately two hours to develop a closed-loop controller from scratch. To overcome the problem of training a neural network with limited data, Generative Adversarial Networks (GANs) were proposed by the computer vision research community. The idea was to create additional “fake” data similar to the real world data, and thereby enlarge the total dataset available for training the target neural network [26], [27], [28]. GANs have also been used in inverse reinforcement learning to recover the reward functions embedded in training environments to perform specific tasks [29], [30]. In a similar fashion, our work aims to approximate the real model globally using a limited real-world dataset, which is augmented with fake data generated using GANs. The main contributions of this paper are as follows:

  • We extend the success of GANs in the domain of computer vision towards learning the IK and the ID in cases where real-world data collection is expensive and the problem is highly nonlinear with high-dimensional inputs.

  • Four types of popular GANs, namely, CGANs [31], LSGANs [32], BiGANs [33], and DualGANs [34] were modified for applicability towards solving the IK and the ID problems. Performance of these methods was compared over the unseen real-world trajectories.

  • Experimental evaluation of these methods was performed on a 3-DOF MICO robotic manipulator [35] and an 8-DOF Fetch robotic manipulator. To test the efficiency of the proposed modified GANs, all training processes were performed on a limited real-world dataset (collected over a period of 40 minutes). The performance of the training process was also evaluated using different sizes of the partial dataset and different deviations for the generator in the GANs.
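Of the four GAN variants named above, the least-squares GAN has the most compact objective to write down. The sketch below computes the standard published LSGAN losses from lists of discriminator scores, using the common target coding a = 0 (fake), b = 1 (real), c = 1 (what the generator wants the discriminator to output). It is only an illustration of the unmodified objective, not the modified version the paper proposes; the function names are hypothetical.

```python
def lsgan_discriminator_loss(d_real, d_fake, a=0.0, b=1.0):
    """Least-squares discriminator loss.

    d_real: discriminator scores on real samples
    d_fake: discriminator scores on generated samples
    a, b:   target labels for fake and real samples (assumed 0 and 1)
    """
    real_term = sum((d - b) ** 2 for d in d_real) / len(d_real)
    fake_term = sum((d - a) ** 2 for d in d_fake) / len(d_fake)
    return 0.5 * real_term + 0.5 * fake_term


def lsgan_generator_loss(d_fake, c=1.0):
    """Least-squares generator loss: push scores on generated samples towards c."""
    return 0.5 * sum((d - c) ** 2 for d in d_fake) / len(d_fake)


if __name__ == "__main__":
    # A discriminator that separates real from fake perfectly has zero loss,
    # while the generator is penalised for every sample scored as fake.
    print(lsgan_discriminator_loss([1.0, 1.0], [0.0, 0.0]))
    print(lsgan_generator_loss([0.0, 0.0]))
```

Because the penalty grows quadratically with the distance from the target label, generated samples that are classified correctly but lie far from the real data still produce a gradient, which is the usual motivation given for this variant.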

The rest of the paper is organized as follows. Section 2 introduces the IK and ID of the robotic manipulators used in this paper. A brief introduction of generative adversarial networks (GANs) is also presented. Section 3 describes the modified GANs for learning the IK and ID. Details on the design of the neural network and sampling methods are presented. Section 4 discusses the simulation and experiment results. Finally, Section 5 concludes the work with directions for future research.

Section snippets

Inverse kinematics and inverse dynamics

Kinematics describes the relationship between the coordinates in the joint space, q, and those in the task space, x. The forward kinematics maps the joint space to the task space, FK: q → x, while the inverse kinematics presents the opposite mapping, IK: x → q. Many methods have been developed to solve the kinematics problem, such as the geometric method and the Denavit–Hartenberg (DH) method. The closed loop equations have singularities and nonlinearities and thereby make the IK solving
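To make the two mappings concrete, the following minimal sketch implements the forward and inverse kinematics of a hypothetical planar two-link arm using the geometric method. The link lengths and function names are assumptions chosen for illustration; this is not a model of the MICO or Fetch manipulators. Even in this small case the inverse mapping has two solution branches and no solution outside the workspace, which hints at why the problem becomes hard for real manipulators.

```python
import math

# Hypothetical link lengths in metres (illustrative only).
L1, L2 = 0.30, 0.25


def forward_kinematics(q1, q2):
    """FK: joint angles in radians -> end-effector position (x, y)."""
    x = L1 * math.cos(q1) + L2 * math.cos(q1 + q2)
    y = L1 * math.sin(q1) + L2 * math.sin(q1 + q2)
    return x, y


def inverse_kinematics(x, y, flip_elbow=False):
    """IK: end-effector position -> one joint solution, or None if unreachable.

    flip_elbow selects the second of the two geometric solution branches.
    """
    # Law of cosines gives the cosine of the elbow angle.
    c2 = (x * x + y * y - L1 * L1 - L2 * L2) / (2.0 * L1 * L2)
    if abs(c2) > 1.0:
        return None  # target lies outside the reachable workspace
    s2 = math.sqrt(1.0 - c2 * c2)
    if flip_elbow:
        s2 = -s2
    q2 = math.atan2(s2, c2)
    q1 = math.atan2(y, x) - math.atan2(L2 * s2, L1 + L2 * c2)
    return q1, q2


if __name__ == "__main__":
    target = forward_kinematics(0.4, 0.9)
    print(inverse_kinematics(*target))                    # recovers the original angles
    print(inverse_kinematics(*target, flip_elbow=True))   # a second valid solution
    print(inverse_kinematics(1.0, 1.0))                   # unreachable target
```

Both branches returned by `inverse_kinematics` map back to the same task-space point under `forward_kinematics`, so the inverse mapping is not a function without an extra rule for choosing a branch.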

Proposed algorithm

This section describes the proposed GAN architecture to approximate the IK and the ID of both MICO and Fetch robotic manipulators using real-world data.

Data collection for MICO robotic manipulator

To avoid overfitting of the training model in local paths, random trajectories, instead of predefined trajectories, were generated for the end effector of the MICO robotic manipulator to follow. Each trajectory consisted of multiple waypoints, distributed over the whole actuation space (joint space). In sampling the waypoints, two criteria were used to ensure the feasibility and safety in achieving the desired motion: (1) the joint positions should fall within the feasible configuration space
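The first sampling criterion can be sketched as drawing each waypoint uniformly inside per-joint limits and checking that it lies in the feasible box. The joint limits, function names, and seed below are hypothetical and are not the MICO manipulator's actual limits; the second criterion is not visible in this excerpt and is therefore not modelled.

```python
import random

# Hypothetical joint limits in radians for a 3-joint arm (not MICO's real limits).
JOINT_LIMITS = [(-3.0, 3.0), (0.9, 5.4), (0.4, 5.9)]


def is_feasible(q, limits=JOINT_LIMITS):
    """Criterion (1): every joint position lies within its allowed range."""
    return all(lo <= qi <= hi for qi, (lo, hi) in zip(q, limits))


def sample_waypoint(rng, limits=JOINT_LIMITS):
    """Draw one joint-space waypoint uniformly inside the limits."""
    return [rng.uniform(lo, hi) for lo, hi in limits]


def sample_trajectory(n_waypoints, seed=0, limits=JOINT_LIMITS):
    """Build a random trajectory as a list of feasible joint-space waypoints."""
    rng = random.Random(seed)  # seeded so the trajectory is reproducible
    trajectory = []
    while len(trajectory) < n_waypoints:
        q = sample_waypoint(rng, limits)
        if is_feasible(q, limits):  # further criteria would be checked here
            trajectory.append(q)
    return trajectory


if __name__ == "__main__":
    for waypoint in sample_trajectory(3):
        print([round(qi, 3) for qi in waypoint])
```

Sampling over the whole joint-space box, rather than along a few predefined paths, is what spreads the training data over the actuation space in the sense described above.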

Conclusion

In this paper, we have introduced a series of modified Generative Adversarial Networks for solving the inverse kinematics and dynamics of robots using real-world experimental data. Existing research has focused on learning the uncertainty along with a simplified analytical model or predicting the hindsight analytic model, which could then be used as ground truth. However, existing techniques do not allow for global estimation of the underlying model of the system. Moreover, they require

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors would like to thank Jiteng Yang, Bijo Sebastian and Daniel Budolak for their help in this work. We also gratefully acknowledge the support of NVIDIA Corporation, USA, with the donation of the Titan Xp GPU used for this research.

Hailin Ren (S’16) received the B.S. degree from Nanjing University of Science and Technology in 2013 and the M.S. degree from Columbia University in 2014. He is currently pursuing the Ph.D. degree at the Virginia Polytechnic Institute and State University (Virginia Tech) under the supervision of Prof. P. Ben-Tzvi. His research interests include Artificial Intelligence, Computer Vision and Autonomous Robotics.

References (44)

  • Duka A.-V. Neural network based inverse kinematics solution for trajectory tracking of a robotic arm. Proc. Technol. (2014)

  • Toshani H. et al. Real-time inverse kinematics of redundant manipulators using neural networks and quadratic programming: A Lyapunov-based approach. Robot. Auton. Syst. (2014)

  • Rone W.S. et al. Continuum robot dynamics utilizing the principle of virtual power. IEEE Trans. Robot. (2014)

  • Thuruthel T.G. et al. Model-based reinforcement learning for closed-loop dynamic control of soft robotic manipulators. IEEE Trans. Robot. (2019)

  • Stulp F. et al. Reinforcement learning with sequences of motion primitives for robust manipulation. IEEE Trans. Robot. (2012)

  • Hu K. et al. Learning and generalization of compensative zero-moment point trajectory for biped walking. IEEE Trans. Robot. (2016)

  • Gosavi A. Simulation-based optimization

  • Lillicrap T.P. et al. Continuous control with deep reinforcement learning (2015)

  • Csiszar A. et al. On solving the inverse kinematics problem using neural networks

  • Ansari Y. et al. A multiagent reinforcement learning approach for inverse kinematics of high dimensional manipulators with precision positioning

  • Meier F. et al. Towards robust online inverse dynamics learning

  • Rayyes R. et al. Learning inverse statics models efficiently with symmetry-based exploration. Front. Neurorobotics (2018)

    Pinhas Ben-Tzvi (S’02–M’08–SM’12) received the B.S. degree (summa cum laude) in mechanical engineering from the Technion – Israel Institute of Technology, and the M.S. and Ph.D. degrees in mechanical engineering from the University of Toronto. He is currently an Associate Professor of Mechanical Engineering at Virginia Tech. His current research interests include robotics and intelligent autonomous systems, human–robot interactions, robotic vision, machine learning, mechatronics design, systems dynamics and control, and novel sensing and actuation.

    Dr. Ben-Tzvi is the recipient of the 2019 Virginia Tech Teaching Excellence Award and the 2018 Faculty Fellow Award. Dr. Ben-Tzvi is Technical Editor for the IEEE/ASME Transactions on Mechatronics, Associate Editor for ASME Journal of Mechanisms and Robotics, and an Associate Editor for IEEE Robotics and Automation Magazine, Automation and Systems, and served as an Associate Editor for IEEE ICRA 2013–2018. He is a member of the American Society of Mechanical Engineers (ASME).
