Neurocomputing

Volume 390, 21 May 2020, Pages 260-267

A robot learning framework based on adaptive admittance control and generalizable motion modeling with neural network controller

https://doi.org/10.1016/j.neucom.2019.04.100

Abstract

Robot learning from demonstration (LfD) enables robots to be programmed quickly. This paper presents a novel LfD framework involving a teaching phase, a learning phase and a reproduction phase, and proposes methods in each of these phases to guarantee the overall system performance. An adaptive admittance controller is developed to take into account the unknown human dynamics so that the human tutor can smoothly move the robot around in the teaching phase. The task model in this controller is formulated by Gaussian mixture regression to extract the human-related motion characteristics. In the learning and reproduction phases, the dynamic movement primitive is employed to model a robotic motion that is generalizable. A neural network-based controller is designed for the robot to track the trajectories generated from the motion model, and a radial basis function neural network is used to compensate for the effect caused by the dynamic environment. Experiments have been performed using a Baxter robot and the results have confirmed the validity of the proposed robot learning framework.

Introduction

Robot learning from demonstration (LfD) has recently drawn much attention due to its high efficiency in robot programming [1]. Robots can learn various skills from a human tutor to complete tasks in complex industrial environments [2]. Compared to conventional programming methods using a teaching pendant, LfD is an easier and more intuitive way for people who are unfamiliar with programming. In addition, the human characteristics embedded in the demonstrations can be exploited by robots to further improve the flexibility and compliance of their motions.

The applicability of LfD frameworks can be assessed according to the criteria defined in [3], which include learning fatigue, adaptability, generality and accuracy. It is difficult to satisfy all the criteria simultaneously; thus, we focus on one of the criteria in each phase of LfD.

The entire process of LfD includes the teaching phase, the learning phase and the reproduction phase. In the teaching phase, the human tutor demonstrates how to perform a task and the motion of the robot or human is recorded. The criterion to be addressed in this phase is learning fatigue: to reduce it, demonstration should be carried out in an intuitive and easy way [3]. There are many ways to accomplish demonstration, such as directly guiding the robot or using visual devices to capture and transmit the human motion. In this paper, the teaching phase is accomplished by directly moving the end-effector of the robot because it is straightforward and usually causes less loss of motion information. The learning phase is often ignored in traditional industrial environments and the demonstrated motion is used directly for reproduction. This causes massive repetition of demonstrations when the tasks are similar, for example, pick-and-place tasks with different place targets. If a demonstration can be generalized to adapt to similar situations, the teaching process becomes more efficient. Thus, a learning phase that models generalizable motion is necessary. The reproduction phase involves trajectory tracking; thus, the tracking accuracy of the robot dynamics controller should be guaranteed.

Admittance control has been widely used in human-robot interaction, as it can generate robot motions based on the human force [4], [5], [6]. Thus, in this paper we use admittance control to achieve human-guided teaching. Admittance control exploits an end-effector position controller to track the output of an admittance model. Most studies on admittance control have not considered the human factor, which is an important part of this control loop. The interaction force between robot and human can be used to recognize the human intent, and to further improve the interaction safety and user experience [7]. In [8], the human force was employed to compute the desired movement trajectory of the human, which is used in the performance index of the admittance model. In [9], the unknown human dynamics was considered in the control loop: the human transfer function and the admittance model were formulated as a Wiener filter, a task model was used to estimate the human intent, and a Kalman filter estimation law was used to tune the parameters of the admittance model. However, the task model in that work is assumed to be a fixed linear system, which is unrealistic because the estimated human motions differ between individuals due to their motion habits. Thus, a task model that captures the human characteristics needs to be developed.

Gaussian mixture regression (GMR) is an effective algorithm to encode the human characteristics based on human demonstrations [10], [11]. In [12], the human motion was analyzed using a Gaussian mixture model (GMM) and a new motion that reflects the human motion distribution was generated using GMR. This algorithm is therefore well suited to building a task model that incorporates the human characteristics.
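To illustrate how GMR encodes demonstrated motion characteristics, the following is a minimal sketch, not the implementation used in this paper: a GMM is fitted to time-indexed position samples from a demonstration, and the conditional expectation of position given time yields a smooth reference that preserves the demonstrated motion distribution. The scikit-learn GaussianMixture fit, the toy demonstration and all variable names are illustrative assumptions.

import numpy as np
from sklearn.mixture import GaussianMixture

def gmr(gmm, t_query):
    """Gaussian mixture regression: condition a GMM fitted on [t, x] pairs
    on the input t and return the expected output x."""
    n_out = gmm.means_.shape[1] - 1
    x_hat = np.zeros((len(t_query), n_out))
    for i, t in enumerate(t_query):
        # responsibility of each component for the input t (constants cancel after normalization)
        h = np.array([
            w * np.exp(-0.5 * (t - m[0]) ** 2 / c[0, 0]) / np.sqrt(c[0, 0])
            for w, m, c in zip(gmm.weights_, gmm.means_, gmm.covariances_)
        ])
        h /= h.sum()
        # mixture of the per-component conditional means of x given t
        for k, (m, c) in enumerate(zip(gmm.means_, gmm.covariances_)):
            x_hat[i] += h[k] * (m[1:] + c[1:, 0] / c[0, 0] * (t - m[0]))
    return x_hat

# toy demonstration: a noisy 1-D trajectory indexed by time
t = np.linspace(0, 1, 200)
x = np.sin(2 * np.pi * t) + 0.05 * np.random.randn(200)
gmm = GaussianMixture(n_components=5, covariance_type='full').fit(np.column_stack([t, x]))
x_smooth = gmr(gmm, t)  # GMR estimate of the demonstrated motion

In the proposed framework such a regression plays the role of the task model inside the admittance loop, supplying an estimate of the tutor's intended motion.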

In the learning phase of LfD, the robotic motion produced by the human guidance is modeled. Dynamic systems (DS) have been widely used to achieve generalization of the motion model [12], [13], [14]. The dynamic movement primitive (DMP) is a powerful method to model generalizable motion based on DS [15], [16], [17]. It exploits a spring-damper system to guarantee the stability of the model, and uses a nonlinear forcing function to drive the model so that the generated motion preserves the characteristics of the original motion. It can be used to effectively model a series of primitive templates that are decomposed from a demonstration [18]. In [19], the DMP was used to model striking motions in robot table tennis; the learned model was used to generate motions with different targets to hit the ball. To achieve trajectory joining and insertion effectively, a method called linearly decayed DMP+ extended the original DMP by using truncated kernels and avoiding the vanishing exponential time decay [20]. In this paper, the DMP is also integrated in our framework to improve the efficiency of LfD. By adjusting the goal parameter of the DMP, we can generate a group of similar motions so that unnecessary repetition of demonstrations is reduced. A minimal sketch of this mechanism is given below.
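The following is a minimal 1-D discrete DMP sketch under common default gains (alpha = 25, beta = alpha/4, critically damped), given only to illustrate the generalization mechanism: a spring-damper transformation system plus a learned forcing term driven by an exponentially decaying canonical variable, where changing the goal g at reproduction time yields a similar motion toward a new target. The exact formulation, basis functions and parameters used by the authors may differ; this minimal version does not rescale the forcing term for new goals.

import numpy as np

def learn_dmp(demo, dt, n_basis=20, alpha=25.0, beta=6.25, alpha_s=4.0):
    """Fit the DMP forcing term of one demonstrated 1-D trajectory (tau = 1)."""
    x = np.asarray(demo, dtype=float)
    v = np.gradient(x, dt)
    a = np.gradient(v, dt)
    x0, g = x[0], x[-1]
    s = np.exp(-alpha_s * np.arange(len(x)) * dt)          # canonical variable over real time
    f_target = a - alpha * (beta * (g - x) - v)             # forcing needed to reproduce the demo
    c = np.exp(-alpha_s * np.linspace(0, 1, n_basis))       # basis centres in s-space
    h = n_basis / c                                          # width heuristic
    psi = np.exp(-h * (s[:, None] - c) ** 2)
    Phi = s[:, None] * psi / psi.sum(axis=1, keepdims=True)  # weighted, normalized basis
    w = np.linalg.lstsq(Phi, f_target, rcond=None)[0]
    return w, c, h, x0, g

def run_dmp(w, c, h, x0, g, T, dt, alpha=25.0, beta=6.25, alpha_s=4.0):
    """Integrate the DMP toward a (possibly new) goal g."""
    x, v, s, traj = x0, 0.0, 1.0, []
    for _ in range(int(T / dt)):
        psi = np.exp(-h * (s - c) ** 2)
        f = s * (psi @ w) / (psi.sum() + 1e-10)              # learned forcing term
        acc = alpha * (beta * (g - x) - v) + f               # spring-damper + forcing
        v += acc * dt
        x += v * dt
        s += -alpha_s * s * dt                               # canonical system
        traj.append(x)
    return np.array(traj)

# usage: learn from one demonstration, then reproduce toward a new goal
dt = 0.002
demo = np.sin(np.linspace(0, np.pi / 2, 500))                # toy demonstration ending at 1.0
w, c, h, x0, g = learn_dmp(demo, dt)
new_traj = run_dmp(w, c, h, x0, g=1.5, T=len(demo) * dt, dt=dt)  # similar motion, new target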

The generated motion is finally used in reproduction, the accuracy of which depends on the performance of the trajectory tracking controller. Controller design methods can be classified into model-based methods and model-free methods. Model-based methods achieve better tracking accuracy because the robot dynamics are considered; however, an accurate robot model is difficult to obtain. Function approximation methods such as neural networks (NN) have been used to solve this problem [21], [22]. In [23], a backpropagation (BP) NN was employed to approximate the unknown model of a vibration suppression device, which achieved better control results. Compared to the BPNN, the radial basis function (RBF) NN has a faster learning procedure and is more suitable for controller design. In this paper, we use the RBF NN to approximate the robot dynamics so that the robot can complete the reproduced motion accurately without knowledge of the robot manipulator dynamics.
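To make the idea concrete, below is a minimal sketch of adaptive RBF compensation on a 1-DOF toy plant; it is not the controller of this paper. The unknown dynamics term is approximated online by a Gaussian RBF network whose weights are updated with a filtered tracking error. The plant model, gains, feature inputs and update law are illustrative assumptions.

import numpy as np

# 1-DOF toy plant: q_ddot = u + d(q, q_dot), with d unknown to the controller
def d_true(q, qd):
    return -2.0 * np.sin(q) - 0.5 * qd               # unknown gravity + friction term

centres = np.linspace(-2, 2, 9)
def rbf(z):                                           # Gaussian RBF features of the joint position
    return np.exp(-((z - centres) ** 2) / 0.5)

dt, T = 1e-3, 5.0
q, qd = 0.0, 0.0
W = np.zeros_like(centres)                            # RBF output weights, adapted online
Kr, lam, gamma = 20.0, 5.0, 50.0                      # feedback gain, error filter, adaptation gain
for k in range(int(T / dt)):
    t = k * dt
    qr, qrd, qrdd = np.sin(t), np.cos(t), -np.sin(t)  # reference trajectory to be tracked
    e, ed = q - qr, qd - qrd
    r = ed + lam * e                                  # filtered (sliding) tracking error
    phi = rbf(q)
    u = qrdd - lam * ed - Kr * r - W @ phi            # feedback + RBF compensation of d
    W += gamma * phi * r * dt                         # adaptive weight update law
    qdd = u + d_true(q, qd)                           # plant integration (Euler)
    qd += qdd * dt
    q += qd * dt

With this structure the tracking error converges while the network weights absorb the unmodeled dynamics, which is the role the RBF NN plays in the reproduction-phase controller of the proposed framework.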

The contributions of this paper are as follows:

1) An adaptive admittance controller is developed to take into account the unknown human dynamics. The task model in this controller is formulated by GMR to extract the human motion characteristics.

2) A complete LfD framework that considers the teaching phase, the learning phase and the reproduction phase is developed, as shown in Fig. 1. In the teaching phase, the adaptive admittance controller described in 1) is employed so that the human tutor can smoothly guide the robot to accomplish the demonstration. In the learning phase, the DMP is used to model the robotic motion; the learned model can generalize the motion to adapt to different situations. In the reproduction phase, an RBF-NN-based trajectory tracking controller is developed to achieve accurate motion reproduction.

This paper is organized as follows. In Section 2, the methodology, including the adaptive admittance control, the DMP and the NN-based controller, is introduced. The experimental study is presented in Section 3, and Section 4 concludes this paper.


Adaptive admittance control with demonstration-based task model

In this section, an adaptive task-specific admittance controller is developed. It adapts the parameters of the prescribed robot admittance model so that the robot system assists the human to achieve task-specific objectives. The task information is modeled by GMR so that the controller can adapt to the human tutor's characteristics. After the design is completed, the adaptive admittance controller is used in the teaching phase for the human tutor to demonstrate.

The prescribed admittance model is defined as
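The equation is truncated in this snippet. As a hedged placeholder only, a commonly used prescribed admittance model relating the measured human force to the desired end-effector motion takes the form

\[
M_d\big(\ddot{x}(t)-\ddot{x}_d(t)\big) + C_d\big(\dot{x}(t)-\dot{x}_d(t)\big) + K_d\big(x(t)-x_d(t)\big) = f_h(t),
\]

where \(x\) is the end-effector position, \(x_d\) the desired trajectory from the task model, \(f_h\) the measured human force, and \(M_d\), \(C_d\), \(K_d\) the admittance inertia, damping and stiffness parameters tuned by the adaptive law; the paper's exact formulation may differ.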

Experiments

Experiments are performed using a Baxter robot that has two 7-DOF arms, as shown in Fig. 3. An ATI force sensor is attached to the end of the left arm to detect the human force. We verify the proposed framework by testing the method in each phase.

Conclusion

In this paper, a novel robot learning framework based on adaptive admittance control and generalizable motion modeling is developed. This framework considers the performance of the methodology in each phase of LfD. A demonstration-based task model is developed using GMR to integrate the human characteristics into the adaptive admittance model. The DMP is used to model generalizable motion in the learning phase, and an RBF-NN-based controller is developed to track the reproduced motion accurately.

Declaration of Competing Interest

None.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant No. 61803039, and Engineering and Physical Sciences Research Council (EPSRC) under Grant EP/S001913.

References (28)

  • I. Ranatunga et al., Adaptive admittance control for human-robot interaction using model reference design and adaptive inverse filtering, IEEE Trans. Control Syst. Technol. (2017)
  • S. Calinon et al., Statistical learning by imitation of competing constraints in joint space and task space, Adv. Robot. (2009)
  • S. Calinon et al., On learning, representing, and generalizing a task in a humanoid robot, IEEE Trans. Syst. Man Cybern. B Cybern. (2007)
  • S. Calinon et al., Statistical dynamical systems for skills acquisition in humanoids, Proceedings of the IEEE-RAS International Conference on Humanoid Robots (2012)
Ning Wang is a Senior Lecturer in Robotics at the Bristol Robotics Laboratory, University of the West of England, United Kingdom. She received the Ph.D. degree in Electronics Engineering from The Chinese University of Hong Kong in 2011. She worked as a post-doctoral research fellow on machine learning at The Chinese University of Hong Kong, and as a research fellow on human-robot interaction with the Centre for Robotics and Neural Systems, University of Plymouth. She has rich project experience: she was a key member of the EU FP7 project ROBOT-ERA, the EU Regional Development funded project ASTUTE 2020, and industrial projects with UK companies. She has received several awards, including the best paper award of ICIRA'15, a best student paper award nomination at ISCSLP'10, and an award of merit at the 2008 IEEE Signal Processing Postgraduate Forum. Her research interests lie in intelligent systems and robotics, in particular human-robot interaction.

Chuize Chen received the B.Eng. degree in automation from the South China University of Technology, Guangzhou, China, in 2017. His current research interests include human-robot interaction, machine learning and robotics.

Chenguang Yang received the B.Eng. degree in measurement and control from Northwestern Polytechnical University, Xi'an, China, in 2005, and the Ph.D. degree in control engineering from the National University of Singapore, Singapore, in 2010. He was a Post-Doctoral Fellow with Imperial College London, London, U.K. His current research interests include robotics and automation. Dr. Yang was a recipient of the Best Paper Award from the IEEE Transactions on Robotics and a number of international conferences.

1 Ning Wang and Chuize Chen contributed equally to this work.
