
Robotics and Autonomous Systems

Volume 54, Issue 9, 30 September 2006, Pages 766-778

Robot learning through task identification

https://doi.org/10.1016/j.robot.2006.04.015

Abstract

The operation of an autonomous mobile robot in a semi-structured environment is a complex, usually non-linear and partly unpredictable process. Lacking a theory of robot–environment interaction that allows the design of robot control code based on theoretical analysis, roboticists still have to resort to trial-and-error methods in mobile robotics.

The RobotMODIC project aims to develop a theoretical understanding of a robot’s interaction with its environment, and uses system identification techniques to identify the robot–task–environment system. In this paper, we present two practical examples of the RobotMODIC process: mobile robot self-localisation and mobile robot training to achieve door traversal.

In both examples, a transparent mathematical function is obtained that maps inputs (sensory perception in both cases) to outputs (location and steering velocity, respectively). Analysis of the obtained models reveals further information about the way in which a task is achieved, the relevance of individual sensors, possible ways of obtaining more parsimonious models, etc.

Introduction

The behaviour of a mobile robot is the result of properties of the robot itself (physical aspects, the “embodiment”), the environment (“situatedness”), and the control program (the “task”) the robot is executing (see Fig. 1). This triangle of robot, task and environment constitutes a complex, interacting and typically nonlinear system that is difficult to analyse, and whose behaviour is only partially predictable. Because of this lack of theoretical understanding, it is hard to develop robot control code without resorting to costly and imprecise trial-and-error methods.

The RobotMODIC project [1] investigates this relationship between robot, task and environment, and aims to develop an analytical description of the interdependency between the factors that govern robot behaviour. The approach we have taken so far is to model aspects of robot behaviour, using transparent mathematical models which, we argue, retain the essential properties of the robot’s behaviour, while being analysable through established mathematical procedures.

Essentially, we view the interaction between a mobile robot and its environment as a kind of “computation”, in which the robot “computes” behaviour (the output) from the three inputs: robot morphology, environmental characteristics and executed task (see Fig. 2).

Similar to a cylindrical lens, which can be used to perform an analog computation, highlighting vertical edges and suppressing horizontal ones, or a camera lens computing a Fourier transform by analog means, a robot’s behaviour (commonly the mobile robot’s trajectory) can be seen as emergent from the three components shown in Fig. 1: the robot “computes” its behaviour from its own makeup, the world’s makeup, and the program it is currently running (the task).

The RobotMODIC procedure, therefore, consists of four main steps (a minimal code sketch of the observation and validation steps follows this list):

(1) Observe the robot’s behaviour in the target environment, by logging sensory perception, motor response and location in the environment,

(2) model the relationship under investigation, using a transparent mathematical modelling process that expresses the relationship as a closed, analysable function,

(3) establish the validity of the obtained model by comparing the originally observed behaviour with that of the model,

(4) analyse the model using standard mathematical techniques, thus gaining understanding of the interaction between robot and environment.
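A minimal sketch of the observation step (1) and the validation step (3), assuming Python with NumPy and hypothetical robot accessors (read_sensors, read_motors and read_location are illustrative names, not a specific robot API); steps (2) and (4) correspond to the ARMAX/NARMAX modelling and analysis described in the remainder of this section.

    import numpy as np

    def observe(robot, n_steps):
        """Step 1: log sensory perception, motor response and location while
        the robot operates in the target environment."""
        log = []
        for _ in range(n_steps):
            log.append((robot.read_sensors(),    # e.g. laser/sonar ranges
                        robot.read_motors(),     # e.g. drive/steering velocities
                        robot.read_location()))  # e.g. (x, y) from an overhead tracker
        return log

    def validate(model, inputs, observed_outputs):
        """Step 3: compare the originally observed behaviour with the model's
        prediction, here via a simple root-mean-square error."""
        predicted = np.array([model(u) for u in inputs])
        return float(np.sqrt(np.mean((predicted - np.asarray(observed_outputs)) ** 2)))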

To obtain models of input–output relationships we use the ARMAX (linear autoregressive, moving average model with exogenous inputs, [2]) or NARMAX (nonlinear ARMAX) system identification methods; we therefore refer to this process as “robot identification”.

Both ARMAX and NARMAX model the input–output relationship as a mathematical function, in our case linear or nonlinear polynomials. ARMAX is available through standard programming packages such as Matlab or Scilab.

The NARMAX modelling approach is a parameter estimation methodology for identifying both the important model terms and the parameters of unknown non-linear dynamic systems. For multiple input, single output noiseless systems this model takes the form:

\[
\begin{aligned}
y(n) = f\bigl(\, & u_1(n), u_1(n-1), \ldots, u_1(n-N_u),\ u_1(n)^2, u_1(n-1)^2, \ldots, u_1(n-N_u)^2,\ \ldots,\ u_1(n)^l, u_1(n-1)^l, \ldots, u_1(n-N_u)^l, \\
& u_2(n), u_2(n-1), \ldots, u_2(n-N_u),\ u_2(n)^2, u_2(n-1)^2, \ldots, u_2(n-N_u)^2,\ \ldots,\ u_2(n)^l, u_2(n-1)^l, \ldots, u_2(n-N_u)^l, \\
& \ldots \\
& u_d(n), u_d(n-1), \ldots, u_d(n-N_u),\ u_d(n)^2, u_d(n-1)^2, \ldots, u_d(n-N_u)^2,\ \ldots,\ u_d(n)^l, u_d(n-1)^l, \ldots, u_d(n-N_u)^l, \\
& y(n-1), y(n-2), \ldots, y(n-N_y),\ y(n-1)^2, y(n-2)^2, \ldots, y(n-N_y)^2,\ \ldots,\ y(n-1)^l, y(n-2)^l, \ldots, y(n-N_y)^l \,\bigr)
\end{aligned}
\]

where y(n) and u(n) are the sampled output and input signals at time n respectively, N_y and N_u are the regression orders of the output and input respectively, and d is the input dimension. f(·) is a non-linear function, typically taken to be a polynomial or wavelet multi-resolution expansion of its arguments. The degree l of the polynomial is the highest sum of powers in any of its terms.
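To make this form concrete, the sketch below (assuming Python with NumPy; the function name and layout are ours, not part of the RobotMODIC tool chain) builds the corresponding candidate regressors, i.e. the powers 1 to l of every lagged input u_i(n-k), k = 0, ..., N_u, and of every lagged output y(n-j), j = 1, ..., N_y, from logged data:

    import numpy as np

    def narmax_regressors(u, y, Nu, Ny, l):
        """Candidate regressor matrix for the polynomial form above: powers
        1..l of each lagged input u_i(n-k), k = 0..Nu, and of each lagged
        output y(n-j), j = 1..Ny (no cross-terms).
        u: (N, d) array of d input signals, y: length-N output signal."""
        u, y = np.asarray(u, float), np.asarray(y, float)
        N, d = u.shape
        start = max(Nu, Ny)                 # first sample with all lags available
        cols, names = [], []
        for i in range(d):
            for k in range(Nu + 1):
                for p in range(1, l + 1):
                    cols.append(u[start - k:N - k, i] ** p)
                    names.append(f"u{i + 1}(n-{k})^{p}")
        for j in range(1, Ny + 1):
            for p in range(1, l + 1):
                cols.append(y[start - j:N - j] ** p)
                names.append(f"y(n-{j})^{p}")
        return np.column_stack(cols), names, y[start:]   # regressors, labels, y(n)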

The NARMAX methodology breaks the modelling problem into the following steps:

(1) Structure detection,
(2) parameter estimation,
(3) model validation,
(4) prediction, and
(5) analysis.

A detailed account of how these steps are carried out is given in [3], [4], [5]; a brief explanation follows below.

Any data set that we intend to model is first split into two sets (usually of equal size). The first, referred to as the estimation data set, is used to calculate the model parameters. The remaining data, referred to as the test set, is used to test and evaluate the model.

The structure of the NARMAX polynomial is determined by the inputs u, the output y, the input and output orders Nu and Ny respectively, and the degree l of the polynomial. Depending on these variables, the number of terms of the NARMAX model polynomial can be very large, but not all of them are significant contributors to the computation of the output; in fact, most terms can often be safely removed from the model equation without introducing any significant errors.
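To make the size of this candidate set concrete, the term count for the pure-power form quoted above (no cross-products between different lagged signals), including a constant term, is

\[
N_{\mathrm{terms}} = d\,(N_u + 1)\,l + N_y\,l + 1 .
\]

With illustrative (assumed) values of d = 16 range sensors, N_u = N_y = 2 and degree l = 2 this already gives 16 · 3 · 2 + 2 · 2 + 1 = 101 candidate terms; if all cross-products up to degree l are admitted as well, the count grows combinatorially to \(\binom{m+l}{l}\) with m = d(N_u + 1) + N_y, i.e. 1326 terms for the same values.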

The calculation of the NARMAX model parameters is an iterative process. Each iteration involves three steps: (i) estimation of model parameters, (ii) model validation and (iii) removal of non-contributing terms.

In the first step the NARMAX model is used to compute an equivalent auxiliary model whose terms are orthogonal, thus allowing their associated parameters to be calculated sequentially and independently from each other. Once the parameters of the auxiliary model are obtained, the NARMAX model parameters are computed from the auxiliary model.

The NARMAX model is then evaluated on the test data set. If the error between the model-predicted output and the actual output is below a user-defined threshold, non-contributing model terms are removed in order to reduce the size of the polynomial.

To determine the contribution of a model term to the output, the so-called Error Reduction Ratio (ERR) [4] is computed for each term. The ERR of a term is the percentage reduction in the total mean-squared error (i.e. the difference between model-predicted and true system output) that results from including the term under consideration in the model equation. The larger the ERR, the more significant the term. Model terms with an ERR under a certain threshold are removed from the model polynomial.

If, in the following iteration, the error is higher as a result of the last removal of model terms, then these terms are re-inserted into the model equation and the model is considered final.
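A simplified sketch of this term-selection loop is given below (assuming Python with NumPy). It follows the forward orthogonal-least-squares idea with the ERR defined above, but refits the surviving parameters by ordinary least squares rather than by back-substitution through the orthogonal auxiliary model, and uses a single ERR cutoff instead of the iterative remove-and-check scheme described in the text:

    import numpy as np

    def forward_select_err(P, y, err_cutoff=1e-3):
        """Greedy forward selection of model terms by Error Reduction Ratio.
        P: (N, M) matrix of candidate regressors (e.g. from narmax_regressors),
        y: length-N output. Returns the selected column indices and their
        least-squares parameters."""
        y = np.asarray(y, float)
        sigma = y @ y                        # total output energy
        selected, W = [], []                 # chosen indices / orthogonalised columns
        while True:
            best_err, best_idx, best_w = 0.0, None, None
            for j in range(P.shape[1]):
                if j in selected:
                    continue
                w = P[:, j].astype(float)
                for wk in W:                 # orthogonalise against already chosen terms
                    w = w - (wk @ w) / (wk @ wk) * wk
                if w @ w < 1e-12:            # numerically dependent column
                    continue
                err = (w @ y) ** 2 / ((w @ w) * sigma)   # ERR of this candidate
                if err > best_err:
                    best_err, best_idx, best_w = err, j, w
            if best_idx is None or best_err < err_cutoff:
                break                        # no remaining term contributes enough
            selected.append(best_idx)
            W.append(best_w)
        theta, *_ = np.linalg.lstsq(P[:, selected], y, rcond=None)
        return selected, theta

In practice the logged data would first be split into the estimation and test halves described above, with selection run on the estimation half and the resulting model judged on its prediction error over the test half.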

The RobotMODIC process offers some very practical benefits to the robotics engineer. It is possible, for instance, to represent robot control code in a very condensed and “universal” format that allows the easy transfer of robot control code between different robot platforms [6] (see, for instance, Section 3 of this paper). It is also possible to represent one sensor modality in terms of another; for example, sonar sensor perceptions can be expressed as functions of laser perceptions (this “sensor identification” is obviously only possible if both sensor modalities contain similar information: to model laser perception as a function of bumper signals, for example, is of course impossible! [7]).
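As a small, hypothetical illustration of such sensor identification, the sketch below reuses narmax_regressors and forward_select_err from the earlier sketches to model one sonar channel as a static degree-2 polynomial of the laser readings (randomly generated stand-ins replace real logs so the fragment runs on its own):

    import numpy as np

    # Hypothetical stand-in data; in practice, time-aligned laser scans and one
    # sonar channel logged on the robot would be used here.
    rng = np.random.default_rng(1)
    laser_scans = rng.uniform(0.1, 8.0, size=(1000, 16))
    sonar_channel = rng.uniform(0.1, 8.0, size=1000)

    # Static (Nu = Ny = 0) degree-2 polynomial of the laser readings.
    X, names, target = narmax_regressors(laser_scans, sonar_channel, Nu=0, Ny=0, l=2)
    terms, theta = forward_select_err(X, target, err_cutoff=1e-3)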

In this paper, we present two practical examples of how this process can be applied to mobile robotics:

(1) Robot navigation: self-localisation of a mobile robot.
(2) Robot programming: obtaining robot control code through operator-supervised training.

Section snippets

Application 1: Mobile robot self-localisation

The experimental scenario we are interested in analysing here is that of a physical mobile robot moving through a real-world environment, and determining its position in continuous Cartesian coordinates x,y, based on external signals (sensory perception P), as well as previous hypotheses regarding the robot’s position. This scenario, which is similar to evidence-based localisation [8], [9], [10] and simultaneous localisation and mapping [11], [12] scenarios, is shown in Fig. 3.
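For the linear, single-output case used here, a minimal sketch of such a localisation model is given below (assuming Python with NumPy; the lag orders, sensor pre-processing and the actual model structure reported in the paper are not reproduced). One model of this form is estimated for the x coordinate and another for y:

    import numpy as np

    def fit_armax_position(P, x, Nu=1, Ny=1):
        """Fit a linear ARMAX-style map from range perceptions to one position
        coordinate. P: (N, d) logged laser/sonar readings, x: length-N coordinate
        (e.g. from an overhead tracker); lag orders here are illustrative."""
        P, x = np.asarray(P, float), np.asarray(x, float)
        start = max(Nu, Ny)
        cols = [P[start - k:len(P) - k, :] for k in range(Nu + 1)]         # lagged inputs
        cols += [x[start - j:len(x) - j, None] for j in range(1, Ny + 1)]  # previous position values
        A = np.hstack(cols + [np.ones((len(x) - start, 1))])               # plus a bias term
        theta, *_ = np.linalg.lstsq(A, x[start:], rcond=None)
        return theta

Inspecting the magnitude of the resulting coefficients gives a first indication of which sensors actually contribute to the position estimate, in the spirit of the analysis summarised later in the paper.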

In contrast to

Application 2: Robot training

In a second set of experiments our objective was to obtain transparent, analysable models of a reactive sensor–motor task. As in the localisation task, our prime objective was transparent modelling, in addition to achieving the specific task itself.
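A minimal sketch of this kind of training run follows (assuming Python with NumPy and reusing narmax_regressors and forward_select_err from the earlier sketches; randomly generated stand-ins replace the real demonstration logs). Sensor readings and the demonstrated steering velocity are logged while an operator drives the robot, a polynomial model is identified from the log, and the model is then evaluated on fresh perceptions to produce steering commands:

    import numpy as np

    # Hypothetical demonstration log; in practice these are recorded while a human
    # operator drives the robot through the task.
    rng = np.random.default_rng(0)
    demo_sensor_log = rng.uniform(0.1, 5.0, size=(500, 16))   # 16 range readings per step
    demo_steering_log = rng.uniform(-0.5, 0.5, size=500)      # demonstrated steering velocity

    # Identify a degree-2 polynomial mapping lagged perceptions to steering velocity.
    X, names, steer_target = narmax_regressors(demo_sensor_log, demo_steering_log,
                                               Nu=1, Ny=0, l=2)
    terms, theta = forward_select_err(X, steer_target, err_cutoff=1e-3)

    def steering_command(recent_sensors):
        """Evaluate the identified polynomial on the most recent scans
        (at least Nu + 1 = 2 consecutive rows are required)."""
        rows, _, _ = narmax_regressors(recent_sensors,
                                       np.zeros(len(recent_sensors)), Nu=1, Ny=0, l=2)
        return float(rows[-1, terms] @ theta)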

To obtain the data needed for modelling the sensor–motor mapping, we employed the experimental method of robot teaching, sometimes referred to as “behavioural cloning” (see [14] for a review article). This method has successfully been used in

Summary

In this paper we present two applications of the RobotMODIC process. In the first case, a linear ARMAX polynomial for robot self-localisation is estimated, which gives the robot’s position as a function of its laser and sonar sensor perceptions.

Initially, no refinement was made to the estimated model polynomial, i.e. all sensors were taken into account to produce the model. By looking at the contribution of each model term in the resulting polynomial it could be seen that not all sensor inputs

Acknowledgements

The RobotMODIC project is supported by the British Engineering and Physical Sciences Research Council under grant GR/S30955/01. Roberto Iglesias is supported through the Spanish research grants PGIDIT04TIC206011PR, TIC2003-09400-C04-03, and TIN2005-03844.


References (22)

• D. Fox et al., Active Markov localisation for mobile robots, Robotics and Autonomous Systems (1998).
• N. Tomatis et al., Hybrid simultaneous localization and map building: a natural integration of topological and metric, Robotics and Autonomous Systems (2003).
• U. Nehmzow et al., Robotmodic: Modelling, identification and characterisation of mobile robots.
• R. Pearson, Discrete-time Dynamic Models (1999).
• S.A. Billings et al., The determination of multivariable nonlinear models for dynamical systems.
• M. Korenberg et al., Orthogonal parameter estimation algorithm for non-linear stochastic systems, International Journal of Control (1988).
• S.A. Billings et al., Correlation based model validity tests for non-linear models, International Journal of Control (1986).
• T. Kyriacou, U. Nehmzow, R. Iglesias, S.A. Billings, Cross-platform programming through system identification, in:...
• U. Nehmzow, Scientific Methods in Mobile Robotics—Quantitative Analysis of Agent Behaviour (2006).
• S. Thrun, Bayesian landmark learning for mobile robot localisation, Machine Learning (1998).
• F. Dellaert, D. Fox, W. Burgard, S. Thrun, Monte carlo localization for mobile robots, in: Proceedings of the IEEE...

Ulrich Nehmzow is Reader in Analytical and Cognitive Robotics in the Department of Computer Science at the University of Essex.

He obtained his Diploma in Electrical Engineering and Information Science from the University of Technology at Munich in 1988, and his Ph.D. in Artificial Intelligence from the University of Edinburgh in 1992. He is a chartered engineer (CEng) and a member of the IEE. After a postdoctoral position at Edinburgh University he became Lecturer in Artificial Intelligence at the University of Manchester in 1994, and Senior Lecturer in Robotics at the University of Essex in 2001. His research interests are scientific methods in robotics, robot learning, robot navigation, simulation and modelling and novelty detection for industrial applications of robotics.

Ulrich Nehmzow is the co-chair of the British conference on mobile robotics (TAROS) and member of the editorial board of Journal of Connection Science and the AISB journal. He is secretary of the International Society for Adaptive Behavior.

Roberto Iglesias received the B.S. and Ph.D. degrees in physics from the University of Santiago de Compostela, Spain, in 1996 and 2003, respectively. He is currently an Associate Professor in the Department of Electronics and Computer Science at the University of Santiago de Compostela, Spain. His research interests focus on control and navigation in mobile robotics and machine learning, mainly reinforcement learning and artificial neural networks.

Theocharis Kyriacou is currently a Senior Research Officer at the University of Essex (UK). He obtained his Ph.D. in Vision-Based Urban Navigation Procedures for Verbally Instructed Robots from the University of Plymouth (UK) in 2004. He earned his B.Eng. (Honours) degree in Electronic Engineering (Systems) from the University of Sheffield in 2000. Prior to studying for his undergraduate degree he obtained a Higher Engineering diploma in Electrical Engineering from the Higher Technical Institute (H.T.I.) of Cyprus.

Stephen A. Billings received the B.Eng. degree in Electrical Engineering with first class honours from the University of Liverpool in 1972, the degree of Ph.D. in Control Systems Engineering from the University of Sheffield in 1976, and the degree of D.Eng. from the University of Liverpool in 1990. He is a Chartered Engineer (CEng), Chartered Mathematician (CMath), Fellow of the IEE (UK) and Fellow of the Institute of Mathematics and its Applications. He was appointed as Professor in the Department of Automatic Control and Systems Engineering, University of Sheffield, UK in 1990 and leads the Signal Processing and Complex Systems research group. His research interests include system identification and information processing for nonlinear systems, NARMAX methods, model validation, prediction, spectral analysis, adaptive systems, nonlinear systems analysis and design, neural networks, wavelets, fractals, machine vision, cellular automata, spatio-temporal systems, fMRI and optical imagery of the brain.
