Information Sciences, Volumes 352–353, 20 July 2016, Pages 79–97 (Elsevier)
Grounding humanoid visually guided walking: From action-independent to action-oriented knowledge

https://doi.org/10.1016/j.ins.2016.02.053

Abstract

In the context of humanoid and service robotics, it is essential that the agent can position itself with respect to objects of interest in the environment. By relying mostly on the cognitivist conception of artificial intelligence, research on visually guided walking has tended to overlook the characteristics of the context in which behavior occurs. Consequently, considerable effort has been directed at defining action-independent explicit models of the solution, often resulting in high computational requirements. In this study, inspired by embodied cognition research, our interest focuses on the analysis of the sensory-motor coupling, notably on the relation between embodiment, information, and action-oriented representation. Hence, by mimicking human walking, a behavior scheme is proposed that endows the agent with the skill of approaching stimuli. A significant contribution to object discrimination was obtained by proposing an efficient visual attention mechanism that exploits the redundancies and the statistical regularities induced in the sensory-motor coordination, so that the information flow is anticipated from the fusion of visual and proprioceptive features in a Bayesian network. The solution was implemented on the humanoid platform Nao, where the task was accomplished in an unstructured scenario.

Introduction

The automation of visually guided walking arguably adopted in its infancy the so-called cognitivist approach to artificial intelligence (AI), which, under the Cartesian dualist influence, has tended to treat physical and mental processes as belonging to different realms. Significant progress has been obtained from this view, though the gains remain distant from the sophistication observed in natural behavior.

Among the several challenges reported in the literature, one is undoubtedly to achieve reliable perception from noisy sensory data. Since the sensory input goes through a process of symbolization, and cognition involves computations over symbols, the physical context in which those symbols emerged is no longer available to the cognitive process. In other words, in abstracting cognition from the context, information is inevitably lost.

To cope with the difficulties of perceiving the object while moving, several methodologies that flourished in machine learning research have been employed (e.g. Markovian models and support vector machines, among others). These attempts have produced impressive results; however, by keeping intact the fundamental premise of decoupling between bodily and mental processes, they have relied on expensive resources in the form of context-free explicit models, knowledge databases, and intensive computation. Thus, the processing bottleneck has impacted the autonomy and the reactivity of the agent, and extraneous variables (i.e. unmodeled phenomena) have been controlled by adapting the scene to the task, which has compromised the generality of the solution.

In recent decades, a different perspective for studying natural behavior has been adopted in the multi-disciplinary research on embodied cognition (EC), where knowledge representation is thought to be grounded in the physical interaction with the environment. Unlike in the cognitivist approach, emergent behavior would not be explicitly represented nor planned in advance. The analysis of the sensory-motor coupling in natural tasks, from a dynamical-systems perspective, appears as a promising research direction that can provide more efficient, robust, and autonomous solutions.

However, adopting the EC methodology also poses important challenges to roboticists, in particular when fulfilling the requirements underlying the physical grounding hypothesis. Firstly, the autonomous evolution of the system, as happens in natural beings, would ideally occur under a phylogenetic architecture that can modify itself. Secondly, the ontology of the system must ensure knowledge acquisition for diverse purposes, by fusing information from different sensory modalities. Lastly, enactment conditions the development of cognitive skills on the sensory-motor coupling and the interaction with the environment, so that knowledge acquisition is a slow process analogous to the natural one.

In view of the advantages and the challenges encountered in the aforementioned research paradigms, we have opted for an intermediate perspective, aiming at obtaining both generality and autonomy for applications in service robotics. Thereby, in this work we study the task of approaching and positioning in relation to visual stimuli. For this, we adopt the cognitivist assumption that humans employ action-independent knowledge for localizing the object in the scene. But, as embodied beings, humans can also resort to contextual information to discriminate and perceive the object. Therefore, we explore action-oriented representations in the form of embodied features that capture bodily sensations emerging in the task. Furthermore, we examine the redundancies and the statistical regularities induced in the sensory-motor coordination to obtain more efficient perceptive processing. We describe a behavior scheme that includes a proposal for the organization of resources, and, through a case study, we consider the information fusion within a Bayesian network in charge of discriminating the object.
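As a minimal sketch of the kind of fusion just described (the discretization, variable names, and distributions below are illustrative assumptions, not the paper's actual model), visual and proprioceptive evidence can be combined with an anticipated prior in a small discrete Bayesian network:

```python
# Hypothetical sketch: anticipating the object's image region by fusing
# visual and proprioceptive evidence under a conditional-independence
# assumption (naive Bayes). Numbers are illustrative only.
import numpy as np

REGIONS = 5  # coarse discretization of horizontal image positions

def fuse(prior, p_visual_given_region, p_proprio_given_region):
    """Posterior over regions: P(r | v, q) ∝ P(v | r) P(q | r) P(r),
    assuming visual and proprioceptive cues conditionally independent."""
    post = prior * p_visual_given_region * p_proprio_given_region
    return post / post.sum()

# Prior anticipated from sensory-motor regularities (e.g. during a turn
# the stimulus tends to drift toward one side of the image).
prior = np.array([0.05, 0.10, 0.20, 0.40, 0.25])

# Likelihoods of the current visual saliency and joint-encoder readings.
p_vis = np.array([0.02, 0.08, 0.30, 0.45, 0.15])
p_pro = np.array([0.10, 0.15, 0.25, 0.35, 0.15])

posterior = fuse(prior, p_vis, p_pro)
print(posterior.argmax())  # index of the most probable region
```

Anticipating the prior from the sensory-motor regularities lets the attention mechanism restrict visual processing to the few most probable regions, which is the source of the computational savings the abstract refers to.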

This document is organized as follows. Section 2 discusses related contributions and the challenges encountered in the cognitivist approach. Section 3 begins by situating our research within the embodied cognition literature, and then proceeds with the methodological analysis of the task, where the behavior scheme is proposed and described. Section 4 presents a case study designed around simulations and experiments with the robot Nao. The results obtained are discussed in Section 5 along with research perspectives. Finally, the conclusions of the study are given in Section 6.

Section snippets

Related work

Vision-based locomotion is a challenging task for walking robots. Unlike natural beings, which possess extremely sophisticated sensory organs, the vast majority of research in robotic vision has been carried out with far more limited equipment, usually general-purpose cameras. Moreover, the body structure and the actuation system utilized are much less stable, fine, and accurate when compared to the natural musculoskeletal system. In view of such limitations, several

Grounding vision-based locomotion

The term embodied cognition reunites co-existing research interests with diverse subject matters. A thorough review of the conflicting views in EC is beyond the scope of this paper (we refer the reader to the works of Shapiro [30] and Wilson [36] for a discussion of this topic). We are thus in agreement with Anderson [2] when he identifies in the physical grounding hypothesis (Brooks [4]) the distinctive aspect of EC, as opposed to a situated but cognitivist view of embodiment. Accordingly,

Case study

The case study has been conducted in three stages. At first, a proposal for the different components of the behavior scheme described in Section 3.1.3 was developed and implemented. Then, simulations were carried out to adjust the parameters of the tasks, and to verify that the proposed scheme is able to produce the desired behavior. Finally, a real approaching task was designed in an unstructured scenario with the robot Nao. Below, the results obtained are reported.

Discussion

Starting from the analysis of the dynamic aspects of human visually guided locomotion, the first-order descriptions proposed (in Eqs. (3) and (4)), and implemented in parallel motor tasks, allowed the agent to mimic the human style of approaching objects (see Fig. 14). Although this aspect is a non-functional requirement, and only ensures the aesthetics of motion, it is of crucial importance for the acceptance of the solution in a human-machine interaction context. The study also showed that when
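The behavior of such first-order descriptions can be illustrated with a minimal sketch (the gains, set-points, and initial conditions below are assumptions for the illustration, not the paper's Eqs. (3) and (4)): both the distance to the stimulus and the heading error relax exponentially toward their targets, which is what produces the smooth, human-like approach trajectory.

```python
# Illustrative first-order approach dynamics (assumed parameters, not
# the paper's Eqs. (3)-(4)): distance d relaxes toward the stopping
# distance d_star, and the heading error phi relaxes toward zero.
k_d, k_phi = 0.8, 1.5      # assumed convergence gains (1/s)
d_star = 0.3               # assumed desired stopping distance (m)
dt, steps = 0.01, 1000     # Euler integration over 10 s

d, phi = 2.0, 0.6          # initial distance (m) and heading error (rad)
for _ in range(steps):
    d += dt * (-k_d * (d - d_star))      # d_dot = -k_d (d - d_star)
    phi += dt * (-k_phi * phi)           # phi_dot = -k_phi * phi

print(d, phi)  # both near their targets after 10 s
```

Because each equation is first-order and stable, no trajectory planning is needed: the approach emerges from running the two relaxations in parallel motor tasks, consistent with the behavioral-dynamics view of steering cited in the references (Fajen et al.).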

Conclusions and perspectives

This work started from an interest in applications of humanoid service robotics, in particular in the behavior of visually guided approaching of a given face of an object in the scene. From the difficulties encountered in cognitivist automation research involving locomotion and vision, we have concentrated our efforts on exploiting the emergence of contextual information, notably the redundancies and the statistical regularities induced in the sensory-motor coupling. For this, we conducted

Acknowledgments

This research has been funded by the Ecole Centrale de Nantes (ECN) and EQUIPEX ROBOTEX, France; and the CAPES Foundation, Ministry of Education of Brazil, Brasília - DF 700040-020, Brazil.

References (36)

  • M. Anderson

    Embodied cognition: a field guide

    Artif. Intell.

    (2003)
  • M. Proetzsch et al.

    Development of complex robotic systems using the behavior-based control architecture ib2c

    Robot. Auton. Syst.

    (2010)
  • B.F. Allen et al.

    Localizing a mobile robot with intrinsic noise

    3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video (3DTV-CON), 2012

    (2012)
  • M.A. Arbib et al.

    Neurorobotics: from vision to action

  • R.A. Brooks

    Cambrian Intelligence: The Early History of the New AI

    (1999)
  • H.F. Chame et al.

Embodied localization in visually-guided walk of humanoid robots

    ICINCO (2)

    (2014)
  • F. Chaumette et al.

Visual servo control, Part I: basic approaches

    IEEE Robot. Autom. Mag.

    (2006)
  • A. Clark

    Being There: Putting Brain, Body, and World Together Again

    (1998)
  • A. Comport et al.

    Real-time markerless tracking for augmented reality: the virtual visual servoing framework

IEEE Trans. Vis. Comput. Graph.

    (2006)
  • P.I. Corke

Robotics, Vision and Control: Fundamental Algorithms in MATLAB

    (2011)
  • A.J. Davison et al.

MonoSLAM: real-time single camera SLAM

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2007)
  • C. Dune et al.

    Vision based control for humanoid robots

IROS Workshop on Visual Control of Mobile Robots (ViCoMoR)

    (2011)
  • W. Ertel

Introduction to Artificial Intelligence

    (2011)
  • B.R. Fajen et al.

    Behavioral dynamics of steering, obstacle avoidance, and route selection

    J. Exp. Psychol. Hum. Percept. Perform.

    (2003)
  • M. Hoffmann et al.

    The implications of embodiment for behavior and cognition: animal and robotic case studies

    Clin. Orthop. Relat. Res.

    (2012)
  • A. Hornung et al.

    Humanoid robot localization in complex indoor environments

    Intelligent Robots and Systems (IROS), 2010 IEEE/RSJ International Conference on

    (2010)
  • L. Itti et al.

    A model of saliency-based visual attention for rapid scene analysis

    IEEE Trans. Pattern Anal. Mach. Intell.

    (1998)
  • B. Jähne

    Digital Image Processing

    (2002)