Grounding humanoid visually guided walking: From action-independent to action-oriented knowledge
Introduction
The automation of visually guided walking arguably adopted, in its infancy, the so-called cognitivist approach to artificial intelligence (AI), which, under the influence of Cartesian dualism, has tended to treat physical and mental processes as belonging to different realms. Significant progress has been achieved from this view, though the results remain far from the sophistication observed in natural behavior.
Among the many challenges reported in the literature, one is undoubtedly achieving reliable perception from noisy sensory data. Since the sensory input goes through a process of symbolization, and cognition involves computations over symbols, the physical context in which those symbols emerged is no longer available to the cognitive process. In other words, in abstracting cognition from its context, information is inevitably lost.
To cope with the difficulties of perceiving the object while moving, several methodologies that have flourished in machine learning research have been employed (e.g., Markovian models and support vector machines, among others). These attempts have produced impressive results; however, by keeping intact the fundamental premise of a decoupling between bodily and mental processes, they have relied on expensive resources in the form of context-free explicit models, knowledge databases, and intensive computation. The resulting processing bottleneck has limited the autonomy and reactivity of the agent, and extraneous variables (i.e., unmodeled phenomena) have been controlled by adapting the scene to the task, which compromises the generality of the solution.
In recent decades, a different perspective on natural behavior has been adopted in the multi-disciplinary research on embodied cognition (EC), where knowledge representation is thought to be grounded in physical interaction with the environment. Unlike in the cognitivist approach, emergent behavior is neither explicitly represented nor planned in advance. The analysis of the sensory-motor coupling in natural tasks, from a dynamical-systems perspective, appears as a promising research direction that can provide more efficient, robust, and autonomous solutions.
However, adopting the EC methodology also poses important challenges to roboticists, in particular when fulfilling the requirements underlying the physical grounding hypothesis. First, the autonomous evolution of the system, as happens in natural beings, would ideally occur under a phylogenetic architecture that can modify itself. Second, the ontology of the system must ensure knowledge acquisition for diverse purposes, by fusing information from different sensory modalities. Last, enactment conditions the development of cognitive skills on the sensory-motor coupling and the interaction with the environment; thus, knowledge acquisition is a slow process analogous to the natural one.
In view of the advantages and challenges of the aforementioned research paradigms, we have opted for an intermediate perspective, aiming at both generality and autonomy for applications in service robotics. Accordingly, in this work we study the task of approaching and positioning in relation to visual stimuli. We adopt the cognitivist assumption that humans employ action-independent knowledge for localizing the object in the scene. But, as embodied beings, humans can also resort to contextual information to discriminate and perceive the object. Therefore, we explore action-oriented representations in the form of embodied features that capture bodily sensations emerging in the task. Furthermore, we examine the redundancies and statistical regularities induced by sensory-motor coordination to obtain more efficient perceptive processing. We describe a behavior scheme that includes a proposal for resource organization and, through a case study, we consider information fusion within a Bayesian network in charge of discriminating the object.
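As a minimal illustration of the kind of probabilistic fusion involved, the sketch below combines two sensory cues under a naive-Bayes (conditional-independence) assumption to discriminate the object from the background. The cue names and likelihood values are illustrative placeholders, not the network or parameters used in the study.

```python
# Hedged sketch: naive-Bayes fusion of two sensory cues for object
# discrimination. All likelihood values below are illustrative,
# not taken from the paper's Bayesian network.

def fuse_cues(prior, likelihoods):
    """Posterior over hypotheses, assuming conditionally independent cues."""
    unnorm = {}
    for h in prior:
        p = prior[h]
        for cue in likelihoods:
            p *= cue[h]          # multiply in each cue's likelihood
        unnorm[h] = p
    z = sum(unnorm.values())     # normalizing constant
    return {h: p / z for h, p in unnorm.items()}

prior = {"object": 0.5, "background": 0.5}
color_cue = {"object": 0.8, "background": 0.3}   # P(color obs | hypothesis)
depth_cue = {"object": 0.7, "background": 0.4}   # P(depth obs | hypothesis)

posterior = fuse_cues(prior, [color_cue, depth_cue])
```

With these placeholder values, both cues favor the object hypothesis and the fused posterior is correspondingly sharper than either cue alone, which is the basic benefit sought from multi-modal fusion.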
This document is organized as follows. Section 2 discusses related contributions and challenges encountered within the cognitivist approach. Section 3 begins by situating our research within the embodied cognition literature, and then proceeds with the methodological analysis of the task, where the behavior scheme is proposed and described. Section 4 presents a case study comprising simulations and experiments with the robot Nao. The results obtained are discussed in Section 5, along with research perspectives. Finally, Section 6 gives the conclusions of the study.
Section snippets
Related work
Vision-based locomotion is a challenging task for walking robots. Unlike natural beings, which possess extremely sophisticated sensory organs, the vast majority of research in robotic vision has been carried out with far inferior equipment, usually general-purpose cameras. Moreover, the body structure and actuation systems employed are much less stable, fine, and accurate than the natural musculoskeletal system. In view of such limitations, several
Grounding vision-based locomotion
The term embodied cognition reunites co-existing research interests with diverse subject matters. A thorough review of the conflicting views in EC is beyond the scope of this paper (we refer the reader to the works of Shapiro [30] and Wilson [36] for a discussion of this topic). Rather, we agree with Anderson [2], who identifies in the physical grounding hypothesis (Brooks [4]) the distinctive aspect of EC, as opposed to a situated but cognitivist view of embodiment. Accordingly,
Case study
The case study was conducted in three stages. First, a proposal for the different components of the behavior scheme described in Section 3.1.3 was developed and implemented. Then, simulations were carried out to adjust the task parameters and to verify that the proposed scheme produces the desired behavior. Finally, a real approaching task was designed in an unstructured scenario with the robot Nao. Below, the results obtained are reported.
Discussion
Starting from the analysis of the dynamic aspects of human visually guided locomotion, the first-order descriptions proposed (in Eqs. (3) and (4)), implemented as parallel motor tasks, allowed the agent to mimic the human style of approaching objects (see Fig. 14). Although this aspect is a non-functional requirement and only ensures the aesthetics of motion, it is of crucial importance for the acceptance of the solution in a human-machine interaction context. The study also showed that when
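To make the flavor of such first-order laws concrete, the sketch below drives forward speed from the remaining distance and turn rate from the heading error, run as two parallel motor tasks. This is an illustrative construction in the spirit of behavioral-dynamics steering models, not a reproduction of the paper's Eqs. (3) and (4); the gains `k_v`, `k_w` and the saturation `v_max` are assumed values.

```python
import math

# Illustrative first-order approach dynamics (not the paper's Eqs. (3)-(4)):
# forward speed proportional to remaining distance (saturated), turn rate
# proportional to heading error. Gains are assumed, not from the study.

def step(x, y, heading, goal, k_v=0.5, k_w=2.0, v_max=0.3, dt=0.1):
    gx, gy = goal
    dx, dy = gx - x, gy - y
    dist = math.hypot(dx, dy)
    bearing = math.atan2(dy, dx)
    # wrap heading error to (-pi, pi]
    err = math.atan2(math.sin(bearing - heading), math.cos(bearing - heading))
    v = min(k_v * dist, v_max)   # first-order slowdown near the goal
    w = k_w * err                # first-order heading correction
    return (x + v * math.cos(heading) * dt,
            y + v * math.sin(heading) * dt,
            heading + w * dt)

state = (0.0, 0.0, 0.0)          # start at origin, facing +x
for _ in range(400):             # 40 s of simulated time
    state = step(*state, goal=(1.0, 1.0))
```

Because speed decays with distance, the resulting trajectory decelerates smoothly into the goal rather than stopping abruptly, which is one way the "human style" of approach discussed above can be captured by first-order dynamics.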
Conclusions and perspectives
This work started from an interest in applications of humanoid service robotics, in particular the behavior of visually guided approach toward a given face of an object in the scene. Given the difficulties encountered in cognitivist automation research involving locomotion and vision, we have concentrated our efforts on exploiting the emergence of contextual information, notably the redundancies and statistical regularities induced in the sensory-motor coupling. For this, we conducted
Acknowledgments
This research has been funded by the Ecole Centrale de Nantes (ECN) and EQUIPEX ROBOTEX, France; and the CAPES Foundation, Ministry of Education of Brazil, Brasília - DF 700040-020, Brazil.
References (36)
- Embodied cognition: a field guide. Artif. Intell. (2003)
- Development of complex robotic systems using the behavior-based control architecture iB2C. Robot. Auton. Syst. (2010)
- Localizing a mobile robot with intrinsic noise. 3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video (3DTV-CON) (2012)
- Neurorobotics: from vision to action
- Cambrian Intelligence: The Early History of the New AI (1999)
- Embodied localization in visually-guided walk of humanoid robots. ICINCO (2) (2014)
- Visual servo control, part I: basic approaches. IEEE Robot. Autom. Mag. (2006)
- Being There: Putting Brain, Body, and World Together Again (1998)
- Real-time markerless tracking for augmented reality: the virtual visual servoing framework. IEEE Trans. Vis. Comput. Graph. (2006)
- Robotics, Vision & Control: Fundamental Algorithms in MATLAB (2011)
- MonoSLAM: real-time single camera SLAM. IEEE Trans. Pattern Anal. Mach. Intell.
- Vision based control for humanoid robots. IROS Workshop on Visual Control of Mobile Robots (ViCoMoR)
- Introduction to Artificial Intelligence
- Behavioral dynamics of steering, obstacle avoidance, and route selection. J. Exp. Psychol. Hum. Percept. Perform.
- The implications of embodiment for behavior and cognition: animal and robotic case studies. Clin. Orthop. Relat. Res.
- Humanoid robot localization in complex indoor environments. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (2010)
- A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell.
- Digital Image Processing