Attention-based navigation in mobile robots using a reconfigurable sensor
Introduction
Behaving systems are confronted with large amounts of sensory input. Studies of system complexity show that processing all of these data would require far more neural circuitry than the nervous system actually contains [1], [20]. In other words, sensory systems must possess data-reduction mechanisms that reduce the large flow of data to be processed. Studies of visual processing in humans and primates have revealed that they focus on only a part of the entire visual field [4], [6], [17]. In this manner, the number of operations necessary for detecting and analyzing objects in the environment is decreased significantly. Such a computational property is called “attention”. The scientific field dealing with attention is very diverse, since there are many different ways of selecting targets. For example, some neural mechanisms are “tuned” to particular motion or color stimuli, while others exploit temporal and spatial characteristics. In cognitive science, attention is an important research topic, and issues such as the interaction of attention with memory [7] and the integration of visual features in attention-based object perception [19] are studied.
Especially when there are a number of similar objects in the environment, an attention mechanism can provide additional cues for selecting a particular object. In the brain, selective attention is generated by at least two distinct mechanisms. First, the brain enhances the processing of the selected stimulus relative to the other stimuli present; otherwise, all stimuli would be processed to a similar degree. Second, the selected stimulus signal is actively routed towards the neural processing circuitry appropriate for a given task. Recent brain-imaging and neurophysiological data indicate that the mechanisms underlying attentional computation share a common principle at all levels of the neural circuitry: amplification of the part of the neural area responsible for processing the attended region of the visual image. In other words, brain activity in a particular region is selectively amplified or suppressed as a function of attentional preference [4], [17]. The mechanisms underlying amplification of a particular visual region, and their locations in the visual hierarchy, are currently under investigation. Selection of a particular visual region when a number of stimuli occur in parallel is called “spatial attention” [15]. In this paper, we consider this type of attention.
For autonomous mobile robots, attention-based vision is just as crucial as it is for natural behaving systems. In the view of Ballard [2], active vision for robots (he calls it “animate vision”) using an attention frame can help to reduce the combinatorial explosion of representations relevant to the given task. This principle is exploited, for example, in the architecture of the robot Polly, which finds particular spatial edges in the environment for obstacle avoidance [11]; this can be interpreted as an attention mechanism for that feature in the visual image. In [18], haptic categories are learned in a mobile robot by encoding visual features from a camera, such as intensity levels and edges, as attentional cues.
Most visual sensory configurations for robotic systems use CCD cameras and extract salient features in software. This approach is effective if the system responsible for image processing is powerful enough to meet the real-time performance requirements. To ensure high-speed data processing, the required computation is typically performed off-board (i.e. not on the robot platform) on a powerful computer. For high-speed autonomous mobile robots, however, such an approach causes difficulties because of transmission delays between the robot and the external computer. For such robots it is usually necessary to perform all signal processing on the moving platform. Ideally, much of the signal processing is performed close to the robot’s periphery itself [12]. This arrangement is often found in the peripheral structures of natural systems (e.g. retinas and cochleas), where it ensures high-speed information extraction at low energy cost. In other words, a beneficial approach for robotics is to study and implement such natural structures.
The scientific research field concerned with building electronic devices based on signal-processing principles similar to those found in natural systems is called “neuromorphic engineering” [8]. Expectations are that such neuromorphic analog VLSI devices will provide robotic systems of the future with adequate high-speed peripheral signal processing capability to deal with the real-time requirements of these systems. A number of neuromorphic devices have already been developed. Many of them incorporate visual signal processing including motion detection [9], [10]. A device which deals with attention processing is presented in [14]. This chip, capable of detecting covert attentional shifts, is used for tracking applications.
Most realizations that focus on attention processing assume one target to attend to. Typically, they do not address the phenomenon of visual region amplification for selecting a particular stimulus among others. Nonetheless, this seems to be an important feature in biological systems, as mentioned before. Equipped with such a mechanism, the agent is capable of distinguishing between multiple stimuli and can select the most appropriate one merely by amplifying the corresponding visual region.
In this paper, a method for attention processing is proposed, based on amplification of a particular visual stimulus, and an electronic implementation of the method is presented. As will be explained, much of the signal processing is performed peripherally on an aVLSI retina chip. In this manner, relevant signal processing occurs at the outermost peripheral level, exploiting characteristic properties of the visual stimuli. The chip contains a contrast-sensitive retina followed by a winner-take-all (WTA) stage that extracts the maximum pixel activation. Visual region amplification is achieved by adjusting the local inhibitory strength between adjacent pixels in the retina. The device was tested on an autonomous robot (MorphoII) that was given the task of selecting one of two alternative lines to follow. The robot develops a directional preference by associating a visual stimulus with an energy stimulus, i.e. a solar cell.
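The selection principle just described (amplify the activation in the attended region, then let a winner-take-all pick the strongest response) can be sketched in software. The following Python sketch is purely illustrative: the array size, the gain value, and the simple neighbour-difference contrast measure are assumptions for demonstration, not properties of the chip.

```python
import numpy as np

def contrast_response(image):
    """Edge (contrast) activation: absolute difference between neighbours."""
    return np.abs(np.diff(image, append=image[-1]))

def attend(activation, region, gain=2.0):
    """Amplify the activation inside the attended region (a slice)."""
    biased = activation.copy()
    biased[region] *= gain
    return biased

def winner_take_all(activation):
    """Binary output: 1 at the position of maximum activation, 0 elsewhere."""
    out = np.zeros_like(activation)
    out[np.argmax(activation)] = 1
    return out

# Two line-like stimuli of different contrast; bias attention toward the left one.
image = np.array([0, 0, 5, 5, 0, 0, 6, 6, 0, 0], dtype=float)
act = contrast_response(image)
winner = winner_take_all(attend(act, slice(0, 5), gain=3.0))
```

Without the attentional bias, the WTA settles on the higher-contrast right-hand edge; amplifying the left region makes the weaker left edge win, which is the kind of selection among multiple stimuli exploited in the robot experiments.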
Retina with integrated attentional processing
As mentioned in the previous section, visual attention-based signal processing relies on selective amplification of a part of the visual stimulus. In this paper it is shown that peripheral structures, in this case an artificial retina, can play an active role in this selection process. For the robot, it is not necessary to further analyze the incoming signals for extracting visual attention features at signal processing stages following the retina stage. It must be stated however that it is yet
Electronic visual attention processing
To integrate a selective mechanism for focusing on a particular visual region, the circuit of Fig. 2 was used. The circuit is based on the contrast retina presented in [3]. In this circuit, there are two important control parameters, VI and VD, for inhibition and diffusion of the activation of the retina cells, respectively. With these two parameters, the response characteristics of the retina can be adjusted globally, i.e. they influence the retina’s response to the same extent for all pixels.
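To make the roles of the two parameters concrete, the following toy model treats the retina as a ring of cells with lateral diffusion and lateral inhibition. The scalar coefficients v_d and v_i merely stand in for the control voltages VD and VI; the voltage-to-coefficient mapping, the discrete update rule, and the iteration count are assumptions of this sketch, not the actual circuit dynamics.

```python
import numpy as np

def retina_response(photocurrent, v_i=0.5, v_d=0.2, iterations=20):
    """Toy contrast retina: each cell is driven by its photocurrent,
    smoothed by lateral diffusion (v_d) and suppressed by lateral
    inhibition from its neighbours (v_i). Both parameters act globally,
    i.e. identically for every cell, as on the chip."""
    a = photocurrent.astype(float).copy()
    for _ in range(iterations):
        neighbours = 0.5 * (np.roll(a, 1) + np.roll(a, -1))
        a = photocurrent + v_d * (neighbours - a) - v_i * neighbours
        a = np.maximum(a, 0.0)  # cell outputs cannot go negative
    return a
```

Because inhibition comes from the neighbours, a cell at the border of a bright region receives less suppression than a cell in the interior, so the model responds most strongly at contrast edges, mirroring the behaviour of the contrast retina.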
Chip performance
On the chip, following the retina stage, there is a “winner-take-all” (WTA) circuit that computes the position of the retina cell with the largest output current. The output of the WTA (Vwta) is in fact a stream of discrete binary values: the output is “high” at the location of maximum visual contrast (edge) activation and zero at all other locations. This is illustrated in Fig. 7, in which maximum edge activation is detected at position 11. For an elaborate description of the WTA circuit, see [13],
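The serial binary readout described above can be mimicked in a few lines of Python. The array length and the activation profile are illustrative values; only the winning position 11 is taken from Fig. 7.

```python
def wta_scan(currents):
    """Serial WTA readout: scanning the array yields a binary stream that
    is high only at the position of maximum activation, zero elsewhere."""
    peak = max(range(len(currents)), key=lambda i: currents[i])
    return [1 if i == peak else 0 for i in range(len(currents))]

activation = [0.0] * 16        # hypothetical edge-activation profile
activation[11] = 4.2           # strongest edge at position 11, as in Fig. 7
stream = wta_scan(activation)
```

Because exactly one position in the stream is high, later processing stages need only detect the time slot of the single “high” value rather than compare analog magnitudes.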
Robot experiments
The chip was mounted on the robot MorphoII (Fig. 9). This movable platform consists of a robust base using differential steering. Speed control is performed by a microprocessor (196KD, Intel) using feedback from the wheel-encoders of the two motors. In autonomous mobile robot research, one objective is to create systems that can sustain themselves over an extended period of time. Hence, besides accomplishing a given task, the robot must be able to find an energy source and steer towards it when
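A common way to close the loop from the attended edge position to a differentially steered base is to map the WTA winner index to a pair of wheel-speed setpoints. The sketch below is a hypothetical proportional controller: the pixel count, base speed, and gain are invented for illustration and do not correspond to MorphoII’s actual parameters.

```python
def steering_command(winner_index, n_pixels=25, base_speed=0.3, gain=0.02):
    """Map the WTA winner position to differential wheel speeds: an edge
    left of the image centre slows the left wheel and speeds up the right
    wheel, turning the robot toward the attended stimulus."""
    error = winner_index - (n_pixels - 1) / 2.0  # offset from image centre
    left = base_speed + gain * error
    right = base_speed - gain * error
    return left, right
```

The microprocessor’s encoder-based speed control would then servo each wheel to its setpoint; this outer mapping only decides where the robot should head.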
Discussion
In this paper, an approach to attention-based visual processing is proposed exploiting peripheral analog signal properties. The innovative contribution is to bias the artificial retina of an autonomous robot for focusing on a certain region in its visual field.
This approach offers two technical merits. First, since the output of the retina already contains the information necessary for the robot to bias its visual attention, the processing layers succeeding the retina (such as the WTA) can
Acknowledgements
The author would like to thank Professor Rolf Pfeifer and Professor Rodney Douglas for the numerous valuable discussions, and MOSIS for chip fabrication. This work was performed within the Swiss SPP Biotechnology Program, module Neuroinformatics, grant No. 5002-044889.
References (20)
- D.H. Ballard, Animate vision, Artificial Intelligence (1991)
- et al., Attentional networks, Trends in Neurosciences (1994)
- Cortical connections and parallel processing: structure and function, Behavioral and Brain Sciences (1986)
- K.A. Boahen, A.G. Andreou, A contrast sensitive silicon retina with reciprocal synapses, in: Advances in Neural...
- et al., Selective and divided attention during visual discriminations of shape, color and speed: functional anatomy by positron emission tomography, Journal of Neuroscience (1991)
- T. Delbrück, Bump circuits for computing similarity and dissimilarity of analog voltages, CNS Memo 26, Pasadena, CA, ...
- et al., Neural mechanisms of selective visual attention, Annual Review of Neuroscience (1995)
- et al., Configural processing in memory retrieval: multiple cues and ensemble representations, Cognitive Psychology (1997)
- et al., Neuromorphic analogue VLSI, Annual Review of Neuroscience (1995)
- R. Etienne-Cummings, J. van der Spiegel, P. Mueller, A visual smooth pursuit tracking chip, in: Advances in Neural...
Cited by (3)
- Top-down attention control for device communication manager on mobile robot platform, 2015, International Conference on Robotics and Mechatronics, ICROM 2015
- Attentive robot vision, 2009, Robot Vision: New Research
- Demosaicing low resolution QVGA Bayer pattern, 2007, VISAPP 2007 - 2nd International Conference on Computer Vision Theory and Applications, Proceedings
Marinus Maris received his master’s degree in Electrical Engineering from the Technical University of Delft in The Netherlands in 1988 and Ph.D. from the University of Zurich in Switzerland in 1998. During his graduate work at both the Artificial Intelligence Laboratory of the Institute for Informatics of the University of Zurich and the Institute of Neuroinformatics of the Swiss Federal Institute of Technology, Zurich, he was deeply involved in the study of artificial intelligence and biorobotics as well as neuromorphic engineering. He is currently with The Netherlands Organization for Applied Scientific Research (TNO) in The Hague, and is involved in defining application areas and projects in the fields of network intelligence and robotics. His main research interest is in distributed intelligent systems.