Information Sciences, Volumes 352–353, 20 July 2016, Pages 79–97 (Elsevier)
Grounding humanoid visually guided walking: From action-independent to action-oriented knowledge

https://doi.org/10.1016/j.ins.2016.02.053

Abstract

In the context of humanoid and service robotics, it is essential that the agent can position itself with respect to objects of interest in the environment. By relying mostly on the cognitivist conception of artificial intelligence, research on visually guided walking has tended to overlook the characteristics of the context in which behavior occurs. Consequently, considerable effort has been directed at defining action-independent explicit models of the solution, often resulting in high computational requirements. In this study, inspired by embodied cognition research, our interest focuses on the analysis of the sensory-motor coupling, notably on the relation between embodiment, information, and action-oriented representation. Hence, by mimicking human walking, a behavior scheme is proposed that endows the agent with the skill of approaching stimuli. A significant contribution to object discrimination was obtained by proposing an efficient visual attention mechanism that exploits the redundancies and the statistical regularities induced in the sensory-motor coordination, so that the information flow is anticipated from the fusion of visual and proprioceptive features in a Bayesian network. The solution was implemented on the humanoid platform Nao, where the task was accomplished in an unstructured scenario.

Introduction

The automation of visually guided walking arguably adopted in its infancy the so-called cognitivist approach to artificial intelligence (AI), which, under the Cartesian dualist influence, has tended to treat physical and mental processes as belonging to different realms. Significant progress has been obtained from this view, though the gains remain distant from the sophistication observed in natural behavior.

Among the several challenges reported in the literature, one is undoubtedly to achieve reliable perception from noisy sensory data. Since the sensory input goes through a process of symbolization, and cognition involves computations over symbols, the physical context in which those symbols emerged is no longer available to the cognitive process. In other words, in abstracting cognition from the context, information is inevitably lost.

To cope with the difficulties of perceiving the object while moving, several methodologies that flourished in machine learning research have been employed (e.g. Markovian models and support vector machines, among others). These attempts have produced impressive results; however, by keeping intact the fundamental premise of decoupling between bodily and mental processes, they have relied on expensive resources in the form of context-free explicit models, knowledge databases, and intensive computation. Thus, the processing bottleneck has impacted the autonomy and the reactivity of the agent, and extraneous variables (i.e. unmodeled phenomena) have been controlled by adapting the scene to the task, which has compromised the generality of the solution.

In recent decades, a different perspective for studying natural behavior has been adopted in the multi-disciplinary research on embodied cognition (EC), where knowledge representation is thought to be grounded in the physical interaction with the environment. Unlike in the cognitivist approach, emergent behavior would not be explicitly represented nor planned in advance. The analysis of the sensory-motor coupling in natural tasks, from a dynamical-systems perspective, appears as a promising research direction that can provide more efficient, robust, and autonomous solutions.

However, adopting the EC methodology also poses important challenges to roboticists, in particular when fulfilling the requirements underlying the physical grounding hypothesis. Firstly, the autonomous evolution of the system, as happens in natural beings, would ideally occur under a phylogenetic architecture that can modify itself. Secondly, the ontology of the system must ensure knowledge acquisition for diverse purposes, by fusing information from different sensory modalities. Lastly, enactment conditions the development of cognitive skills on the sensory-motor coupling and the interaction with the environment, so that knowledge acquisition is a slow process analogous to the natural one.

In view of the advantages and the challenges encountered in the aforementioned research paradigms, we have opted for an intermediate perspective, aiming at obtaining both generality and autonomy for applications in service robotics. Thereby, in this work we study the task of approaching and positioning in relation to visual stimuli. For this, we adopt the cognitivist assumption that humans employ action-independent knowledge for localizing the object in the scene. But, as embodied beings, humans can also resort to contextual information to discriminate and perceive the object. Therefore, we explore action-oriented representations in the form of embodied features that capture bodily sensations emerging in the task. Furthermore, we examine the redundancies and the statistical regularities induced in the sensory-motor coordination to obtain more efficient perceptive processing. We describe a behavior scheme that includes a proposal for the organization of resources, and, through a case study, we consider the information fusion within a Bayesian network in charge of discriminating the object.
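As a minimal sketch of the kind of fusion just described (the discretization, variable names, and distributions below are illustrative assumptions, not the paper's actual model), visual and proprioceptive evidence can be combined with an anticipated prior in a small discrete Bayesian network:

```python
# Hypothetical sketch: anticipating the object's image region by fusing
# visual and proprioceptive evidence under a conditional-independence
# assumption (naive Bayes). Numbers are illustrative only.
import numpy as np

REGIONS = 5  # coarse discretization of horizontal image positions

def fuse(prior, p_visual_given_region, p_proprio_given_region):
    """Posterior over regions: P(r | v, q) ∝ P(v | r) P(q | r) P(r),
    assuming visual and proprioceptive cues conditionally independent."""
    post = prior * p_visual_given_region * p_proprio_given_region
    return post / post.sum()

# Prior anticipated from sensory-motor regularities (e.g. during a turn
# the stimulus tends to drift toward one side of the image).
prior = np.array([0.05, 0.10, 0.20, 0.40, 0.25])

# Likelihoods of the current visual saliency and joint-encoder readings.
p_vis = np.array([0.02, 0.08, 0.30, 0.45, 0.15])
p_pro = np.array([0.10, 0.15, 0.25, 0.35, 0.15])

posterior = fuse(prior, p_vis, p_pro)
print(posterior.argmax())  # index of the most probable region
```

Anticipating the prior from the sensory-motor regularities lets the attention mechanism restrict visual processing to the few most probable regions, which is the source of the computational savings the abstract refers to.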

This document is organized as follows. Section 2 discusses related contributions and the challenges encountered in the cognitivist approach. Section 3 begins by situating our research within the embodied cognition literature, and then proceeds with the methodological analysis of the task, where the behavior scheme is proposed and described. Section 4 presents a case study designed around simulations and experiments with the robot Nao. The results obtained are discussed in Section 5 along with research perspectives. Finally, the conclusions of the study are given in Section 6.

Section snippets

Related work

Vision-based locomotion is a challenging task for walking robots. Unlike natural beings, which possess extremely sophisticated sensory organs, the vast majority of research in robotic vision has been carried out with far more limited equipment, usually general-purpose cameras. Moreover, the body structure and the actuation system utilized are much less stable, fine, and accurate when compared to the natural musculoskeletal system. In view of such limitations, several

Grounding vision-based locomotion

The term embodied cognition reunites co-existing research interests with diverse subject matters. A thorough review of the conflicting views in EC is beyond the scope of this paper (we refer the reader to the works of Shapiro [30] and Wilson [36] for a discussion of this topic). We are thus in agreement with Anderson [2] when he identifies in the physical grounding hypothesis (Brooks [4]) the distinctive aspect of EC, as opposed to a situated but cognitivist view of embodiment. Accordingly,

Case study

The case study has been conducted in three stages. At first, a proposal for the different components of the behavior scheme described in Section 3.1.3 was developed and implemented. Then, simulations were carried out to adjust the parameters of the tasks, and to verify that the proposed scheme is able to produce the desired behavior. Finally, a real approaching task was designed in an unstructured scenario with the robot Nao. Below, the results obtained are reported.

Discussion

Starting from the analysis of the dynamic aspects of human visually guided locomotion, the first-order descriptions proposed (in Eqs. (3) and (4)), and implemented in parallel motor tasks, allowed the agent to mimic the human style of approaching objects (see Fig. 14). Although this aspect is a non-functional requirement, and only ensures the aesthetics of motion, it is of crucial importance for the acceptance of the solution in a human-machine interaction context. The study also showed that when
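The behavior of such first-order descriptions can be illustrated with a minimal sketch (the gains, set-points, and initial conditions below are assumptions for the illustration, not the paper's Eqs. (3) and (4)): both the distance to the stimulus and the heading error relax exponentially toward their targets, which is what produces the smooth, human-like approach trajectory.

```python
# Illustrative first-order approach dynamics (assumed parameters, not
# the paper's Eqs. (3)-(4)): distance d relaxes toward the stopping
# distance d_star, and the heading error phi relaxes toward zero.
k_d, k_phi = 0.8, 1.5      # assumed convergence gains (1/s)
d_star = 0.3               # assumed desired stopping distance (m)
dt, steps = 0.01, 1000     # Euler integration over 10 s

d, phi = 2.0, 0.6          # initial distance (m) and heading error (rad)
for _ in range(steps):
    d += dt * (-k_d * (d - d_star))      # d_dot = -k_d (d - d_star)
    phi += dt * (-k_phi * phi)           # phi_dot = -k_phi * phi

print(d, phi)  # both near their targets after 10 s
```

Because each equation is first-order and stable, no trajectory planning is needed: the approach emerges from running the two relaxations in parallel motor tasks, consistent with the behavioral-dynamics view of steering cited in the references (Fajen et al.).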

Conclusions and perspectives

This work started from an interest in applications of humanoid service robotics, in particular in the behavior of visually guided approaching of a given face of an object in the scene. From the difficulties encountered in cognitivist automation research involving locomotion and vision, we have concentrated our efforts on exploiting the emergence of contextual information, notably the redundancies and the statistical regularities induced in the sensory-motor coupling. For this, we conducted

Acknowledgments

This research has been funded by the Ecole Centrale de Nantes (ECN) and EQUIPEX ROBOTEX, France; and the CAPES Foundation, Ministry of Education of Brazil, Brasília - DF 700040-020, Brazil.

References (36)

  • M. Anderson

    Embodied cognition: a field guide

    Artif. Intell.

    (2003)
  • M. Proetzsch et al.

    Development of complex robotic systems using the behavior-based control architecture ib2c

    Robot. Auton. Syst.

    (2010)
  • B.F. Allen et al.

    Localizing a mobile robot with intrinsic noise

    3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video (3DTV-CON), 2012

    (2012)
  • M.A. Arbib et al.

    Neurorobotics: from vision to action

  • R.A. Brooks

    Cambrian Intelligence: The Early History of the New AI

    (1999)
  • H.F. Chame et al.

Embodied localization in visually-guided walk of humanoid robots

    ICINCO (2)

    (2014)
  • F. Chaumette et al.

Visual servo control, Part I: basic approaches

    IEEE Robot. Autom. Mag.

    (2006)
  • A. Clark

    Being There: Putting Brain, Body, and World Together Again

    (1998)
  • A. Comport et al.

    Real-time markerless tracking for augmented reality: the virtual visual servoing framework

IEEE Trans. Vis. Comput. Graph.

    (2006)
  • P.I. Corke

Robotics, Vision and Control: Fundamental Algorithms in MATLAB

    (2011)
  • A.J. Davison et al.

MonoSLAM: real-time single camera SLAM

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2007)
  • C. Dune et al.

    Vision based control for humanoid robots

IROS Workshop on Visual Control of Mobile Robots (ViCoMoR)

    (2011)
  • W. Ertel

Introduction to Artificial Intelligence

    (2011)
  • B.R. Fajen et al.

    Behavioral dynamics of steering, obstacle avoidance, and route selection

    J. Exp. Psychol. Hum. Percept. Perform.

    (2003)
  • M. Hoffmann et al.

    The implications of embodiment for behavior and cognition: animal and robotic case studies

    Clin. Orthop. Relat. Res.

    (2012)
  • A. Hornung et al.

    Humanoid robot localization in complex indoor environments

    Intelligent Robots and Systems (IROS), 2010 IEEE/RSJ International Conference on

    (2010)
  • L. Itti et al.

    A model of saliency-based visual attention for rapid scene analysis

    IEEE Trans. Pattern Anal. Mach. Intell.

    (1998)
  • B. Jähne

    Digital Image Processing

    (2002)