Neurocomputing

Volume 275, 31 January 2018, Pages 2093-2103

Robot teaching by teleoperation based on visual interaction and extreme learning machine

https://doi.org/10.1016/j.neucom.2017.10.034

Abstract

Compared with traditional robot teaching methods, robots can learn various human-like skills in a more efficient and natural manner through teleoperation. In this paper, we propose a teleoperation method based on human-robot interaction (HRI), which mainly uses visual information. With only one teleoperated demonstration, the robot can reproduce a trajectory; however, this trajectory deviates from the optimal one because of imperfections of either the human demonstrator or the robot. We therefore use an extreme learning machine (ELM) based algorithm to transfer the demonstrator’s motions to the robot. To verify the method, a Microsoft Kinect V2 is used to capture the human body motion and the hand state, from which commands are generated to control a Baxter robot in the Virtual Robot Experimentation Platform (V-REP). After learning and training with the ELM, the robot in V-REP can complete a given task autonomously, and the real robot can reproduce the trajectory well. The experimental results show that the developed method achieves satisfactory performance.

Introduction

With the recent rapid advances in robotics, the application of robots in industry has been extended to various fields. Through a teaching by demonstration (TbD) method, a robot can perform a task that differs from the previous one in a new working environment [1], [2]. Traditionally, industrial robots can only learn fixed skills on the assembly line after professionals have spent a great deal of time programming them with keyboards or joysticks [3]. This approach is usually time consuming and not flexible enough for modern manufacturing. In contrast, a robot can be directly programmed by learning human-like manipulation skills from a skilful demonstrator through teleoperation. Therefore, this method enables the robot to adapt to different tasks or environments efficiently.

Teleoperation based on HRI has recently attracted much attention due to the advantages described above [4]. In [5], a TbD method is presented, enhanced by transferring the stiffness profile during HRI; surface electromyography (sEMG) signals are collected and processed to extract the demonstrator’s variable stiffness and hand grasping patterns. In [6], various hand gestures are recognized by the proposed HRI method based on hand-guided demonstration. In [7], a teleoperation-based robot programming method is proposed; to verify it, the authors develop a master-slave teleoperation system in which an exoskeleton device serves as the HRI device.

Many techniques and devices have been applied to HRI for enhanced performance. Among them, visual interaction has been one of the most widely utilized [8]. Because visual interaction based on body motion tracking is comparatively easy to implement, it is most often applied to capture human motion [9].

Neural networks have also been widely applied in TbD methods for robots. In [10], a TbD method for building an adaptive control system is presented, in which the robot improves its performance by repeating a task with the help of a neural network. In [11], a neural learning scheme for estimating stable dynamical systems is presented, and the results show that the method can evaluate such systems accurately. However, robot teaching based on such neural learning schemes tends to be complex and time consuming.

In this paper, we put forward a robot teaching method that uses a virtual teleoperation system based on visual interaction together with a neural learning method based on ELM. More specifically, Microsoft’s second-generation motion capture device, the Kinect V2, is used to track human body motion and the hand state. A simulation experiment has been conducted on the V-REP platform, where the Baxter robot is guided to learn the demonstrator’s motion skills. A learning algorithm based on ELM [12], which teaches the robot to learn skills from the human, is developed. Compared with robot teaching based on other neural networks, the ELM requires fewer training samples and has a high generalization capacity [13].
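For reference, once the hidden-layer weights of an ELM are fixed at random, training reduces to a linear least-squares problem for the output weights. The following is a minimal sketch in Python/NumPy; the network size, activation function and data shapes are illustrative assumptions rather than the settings used in the paper.

import numpy as np

def train_elm(X, T, n_hidden=100, seed=0):
    # Single-hidden-layer ELM: random, fixed input weights;
    # output weights solved analytically by least squares.
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights
    b = rng.standard_normal(n_hidden)                # random hidden biases
    H = np.tanh(X @ W + b)                           # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                     # Moore-Penrose pseudoinverse solution
    return W, b, beta

def predict_elm(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

Here X (inputs) and T (targets) would be, for example, the recorded task states and the corresponding demonstrated joint angles.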

The rest of the paper is structured as follows. In Section 2, we introduce the Kinect sensor, V-REP and its remote API. In Section 3, we present the system, which includes a virtual teleoperation system and a learning and training system. In Section 4, the design methodology of the system is introduced. In Section 5, a space vector approach, a data processing method and a TbD method based on ELM are presented. Finally, the experimental results are presented in Section 6, followed by the conclusion in Section 7.

Section snippets

Kinect sensor

The second-generation Kinect for Windows is used in our work. The Kinect has an RGB color camera, an IR emitter, and a depth sensor composed of an IR camera and an IR projector. With these devices, the Kinect sensor provides full-body 3D motion capture, facial recognition and other capabilities [14]. Compared with the Kinect V1, the Kinect V2 allows us to track up to 25 body joints [15], including the fists and thumbs. Because of this advantage, the Kinect V2 can recognize the hand

The architecture of the system

As shown in Fig. 4, the system we design contains a virtual teleoperation system and a training and learning system. In the first stage, a human demonstrator controls the Baxter in V-REP through the Kinect. In the second stage, a neural network is trained on the data recorded in the first stage, and its output is then sent to the Baxter so that it completes the previous task autonomously.
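As an illustration of the second stage, the learned joint trajectory could be streamed to the simulated Baxter through the V-REP legacy remote API for Python. This is a hedged sketch only: the joint names, the port number and the trajectory variable are assumptions, and the remote API server must be enabled in the V-REP scene.

import vrep  # V-REP legacy remote API bindings (vrep.py, vrepConst.py and the remoteApi library)

client = vrep.simxStart('127.0.0.1', 19999, True, True, 5000, 5)  # connect to a running scene
assert client != -1, 'could not connect to V-REP'

# Hypothetical joint names; the Baxter model in the actual scene may use different ones.
joint_names = ['Baxter_rightArm_joint1', 'Baxter_rightArm_joint2']
handles = [vrep.simxGetObjectHandle(client, name, vrep.simx_opmode_blocking)[1] for name in joint_names]

predicted_trajectory = [[0.0, 0.5], [0.1, 0.6]]  # placeholder rows of joint angles; in practice the ELM output

for q in predicted_trajectory:
    for handle, angle in zip(handles, q):
        vrep.simxSetJointTargetPosition(client, handle, angle, vrep.simx_opmode_oneshot)

vrep.simxFinish(client)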

The virtual teleoperation system, which is the simulation of the real one, can verify the

Acquiring information from Kinect

Kinect skeletal tracking is not affected by ambient lighting because of the infrared information. 3D depth images can be captured by the Kinect due to the mechanism of binocular vision [26].
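For illustration, the skeleton joints and hand states described above could be read on the PC side with, for example, the community PyKinect2 bindings. The paper does not specify its software interface, so the sketch below is an assumption and attribute names may differ slightly.

from pykinect2 import PyKinectV2, PyKinectRuntime

kinect = PyKinectRuntime.PyKinectRuntime(PyKinectV2.FrameSourceTypes_Body)  # body (skeleton) stream only

while True:
    if not kinect.has_new_body_frame():
        continue
    for body in kinect.get_last_body_frame().bodies:
        if not body.is_tracked:
            continue
        wrist = body.joints[PyKinectV2.JointType_WristRight].Position   # 3D camera-space coordinates
        closed = body.hand_right_state == PyKinectV2.HandState_Closed   # hand state, e.g. a start/stop gesture
        print(wrist.x, wrist.y, wrist.z, closed)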

There are three steps for the Kinect to capture the demonstrator’s body information. First, the Kinect adopts an image segmentation method to distinguish the human body from the complex background. Then the Kinect finds the object in the image that is most likely to be human and evaluates depth of field image

Space vector approach

The key to controlling the Baxter with the Kinect is calculating the human joint angles. The Kinect provides the 3D Cartesian coordinates of the joints of a human body. In 3D space, the distance between two points A(x1, y1, z1) and B(x2, y2, z2) is given by d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}, and the vector \vec{AB} can be expressed as \vec{AB} = (x_2 - x_1, y_2 - y_1, z_2 - z_1), with d = |\vec{AB}|. In 3D space, the law of cosines then gives the angle between two joints. A joint in Kinect
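Concretely, the angle at a joint B formed by two adjacent joints A and C follows from the law of cosines, or equivalently from the dot product of the two bone vectors. A small NumPy sketch (the coordinates in the example are made up):

import numpy as np

def joint_angle(A, B, C):
    # Angle at joint B (in radians) given 3D coordinates of joints A, B and C.
    BA = np.asarray(A) - np.asarray(B)
    BC = np.asarray(C) - np.asarray(B)
    cos_theta = BA @ BC / (np.linalg.norm(BA) * np.linalg.norm(BC))
    return np.arccos(np.clip(cos_theta, -1.0, 1.0))

# Example: elbow angle from shoulder, elbow and wrist positions.
theta = joint_angle([0.1, 0.4, 2.0], [0.3, 0.2, 2.0], [0.5, 0.3, 1.9])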

The effectiveness of the virtual teleoperation system

In order to verify the validity of the proposed method, we build a virtual teleoperation platform that mainly consists of the Kinect and V-REP. First, we verify the effectiveness of controlling the robot by human body motions. Four motions are designed to verify that the Baxter robot arms can be moved flexibly in the virtual space.

As shown in Fig. 11, the first two motions show that the Baxter robot arms can swing up and down under the control of the human demonstrator. The remaining motions show that

Conclusion

In this paper, we have developed a virtual teleoperation system based on visual interaction. The human body motion is used to control the robot’s arms, and gestures are used to control the beginning and end of the simulation. In addition, through a TbD method based on ELM, the system can transfer the human motions to the robot. We use the Kinect to acquire the body skeleton data and hand states. Then we use V-REP to build a Baxter robot and its work environment. To verify the effectiveness of the

Acknowledgement

This work was partially supported by the National Natural Science Foundation of China (NSFC) under Grant 61473120, Guangdong Provincial Natural Science Foundation under Grant 2014A030313266, International Science and Technology Collaboration Grant 2015A050502017, Science and Technology Planning Project of Guangzhou under Grant 201607010006, State Key Laboratory of Robotics and System (HIT) Grant SKLRS-2017-KF-13, and the Fundamental Research Funds for the Central Universities under Grant 2017ZD057.


References (27)

  • C. Yang et al., Human-Robot Interaction Interface, Adv. Technol. Mod. Rob. Appl., 2016.

  • C.D. Mutto et al., Time-of-Flight Cameras and Microsoft Kinect (TM), 2012.

  • S. Liu et al., Teaching and learning of deburring robots using neural networks, in: Proceedings of the IEEE International Conference on Robotics and Automation, 1993.

Yang Xu received the B.Eng. degree in automation from the South China University of Technology, Guangzhou, China, in 2016, and is currently pursuing the M.S. degree at the South China University of Technology, Guangzhou, China. His research interests include human-robot interaction, robot imitation learning, and machine learning.

Chenguang Yang received the B.Eng. degree in measurement and control from Northwestern Polytechnical University, Xi’an, China, in 2005, and the Ph.D. degree in control engineering from the National University of Singapore, Singapore, in 2010. He received postdoctoral training at Imperial College London, UK. He is the recipient of the Best Paper Award from the IEEE Transactions on Robotics and a number of international conferences. His research interests lie in robotics, automation and computational intelligence.

Junpei Zhong is currently a research scientist at the Artificial Intelligence Research Center of the National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan. He received the B.Eng. degree from the South China University of Technology in 2006, the M.Phil. degree from the Hong Kong Polytechnic University in 2010, and the doctoral degree (with “magna cum laude”) from the University of Hamburg in 2015. He was awarded a Marie-Curie fellowship for his doctoral study from 2010 to 2013. From 2014 to 2016, he participated in different EU- and Japanese-funded projects at the University of Hertfordshire, Plymouth University and Waseda University before joining AIST. His research interests are machine learning, computational intelligence and cognitive robotics.

Ning Wang received the B.Eng. degree in measurement and control technologies and devices from the College of Automation, Northwestern Polytechnical University, Xi’an, China, in 2005, and the M.Phil. and Ph.D. degrees in electronic engineering from the Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong, China, in 2007 and 2011, respectively. She worked as a post-doctoral fellow at the Department of Computer Science & Engineering, The Chinese University of Hong Kong from 2011 to 2013, and was a research fellow at the School of Computing, Electronics and Mathematics, Plymouth University, United Kingdom from 2014 to 2015. Her research interests lie in signal processing and machine learning, with applications in robust speaker recognition, biomedical pattern recognition, intelligent data analysis, and human-robot interaction.

Lijun Zhao received the bachelor’s degree from Beijing Institute of Technology, Beijing, China, in 1996, and the master’s and Ph.D. degrees from Harbin Institute of Technology (HIT), Harbin, Heilongjiang, China, in 2002 and 2009, respectively, all in mechatronics engineering. He is currently a supervisor of master’s students with the State Key Laboratory of Robotics and Systems, Robotics Institute, Harbin Institute of Technology, China. His research interests include mobile robot 3D environment mapping, perception, navigation and planning in robotics.
