Visual motion pattern extraction and fusion for collision detection in complex dynamic scenes
Introduction
The ability to detect and avoid collisions is vital for animals and mobile intelligent machines. However, many artificial vision systems are not yet able to extract the wealth of information in visual scenes quickly and cheaply [1]. Detecting colliding objects in complex dynamic scenes remains a difficult task for conventional robotic vision technologies, especially with limited computing resources [2].
The visual collision avoidance systems of insects have evolved over millions of years and are efficient and reliable in the insects’ visual environments. The neural circuits that process visual information in insects are relatively simple compared with those in the human brain and can serve as good models for optical collision detection sensors [3]. The visual processing mechanisms of insects revealed by neurobiologists over the past decades have already begun to provide solutions for collision avoidance and vision based robotic navigation (for example, [4], [5]; a review is available in [6]).
The lobula giant movement detector (LGMD) is a large visual interneuron in the optic lobe of the locust that responds most strongly to approaching objects [7], [8], [9]. A functional model of the LGMD’s input circuitry showed the same selectivity as the LGMD neuron [10] and has been used for collision detection and avoidance in a mobile robot and a car [5], [11], [12]. Its efficiency in detecting collisions from vision alone makes an LGMD based neural network suitable for real time collision detection. However, in complex driving scenes, the tuned LGMD based neural network also responds briefly to nearby fast translating objects in its visual field [11]. Unfortunately, fast translating visual events do occur in road scenes, for example at a roundabout, a T-junction or a crossroads.
To deal with these fast translating objects, the outputs of four directionally sensitive neurons, based on crossed strips of correlated EMDs (elementary movement detectors), were used either to directly suppress the LGMD’s spiking response or to be combined with the LGMD output to reach a decision [13]. One disadvantage of this direct integration is that mistakes made by the EMDs can directly corrupt an otherwise correct collision detection by the LGMD. An independent decision on translating events, made by a separate specialized neural network, could provide a better solution. To increase the chance of detecting translating events, whole field direction selective neural networks (DSNNs) were used to form this specialized translating sensitive neural network (TSNN).
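The correlation-type EMDs referred to above can be sketched as a minimal Hassenstein–Reichardt correlator. This is an illustrative one-frame-delay version; the function name and the delay are our assumptions, not the exact model used in [13]:

```python
import numpy as np

def reichardt_emd(signal_a, signal_b):
    """Minimal Hassenstein-Reichardt elementary movement detector sketch.
    Correlates each photoreceptor's signal with the delayed signal of its
    neighbour; the sign of the opponent output gives the motion direction."""
    a = np.asarray(signal_a, dtype=float)  # time series at photoreceptor A
    b = np.asarray(signal_b, dtype=float)  # time series at neighbour B
    delayed_a = np.concatenate(([0.0], a[:-1]))  # one-frame delay line
    delayed_b = np.concatenate(([0.0], b[:-1]))
    # opponent correlation: positive for motion from A towards B
    return np.sum(delayed_a * b - delayed_b * a)
```

The opponent subtraction makes the detector signed: a stimulus sweeping from A towards B drives the output positive, the reverse direction drives it negative, and flicker common to both inputs largely cancels.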
Direction selective neurons have been known in animals for decades, for example in insects such as locusts [14], [15], beetles [16] and flies [17], and in vertebrates such as rabbits [18], [19], [20], [21] and cats [22], [23]; a recent survey is available [24]. There are many ways to form a computational DSNN (for example, [25], [26]). Recent results suggest that asymmetric lateral inhibition ensures robust directional selectivity in the rabbit’s retina [21]. In this paper, we use an asymmetric lateral inhibitory mechanism to form the whole-field DSNNs. These DSNNs have a network structure similar to that of the LGMD neural network, but with asymmetric lateral inhibition. A TSNN, as a further level of organisation above these DSNNs, fuses the visual motion cues they extract so that it responds only to fast translating objects.
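The asymmetric lateral inhibition principle can be illustrated with a one-dimensional toy layer. This is only a sketch of the mechanism, not the paper's exact network; the inhibition weight and the one-frame inhibitory delay are illustrative assumptions:

```python
import numpy as np

def dsnn_response(frames, preferred='right', w_inhib=1.0):
    """Toy whole-field direction-selective layer with asymmetric lateral
    inhibition. Excitation is the absolute frame-to-frame luminance change
    at each cell; inhibition is the previous frame's excitation of the
    null-direction neighbour, so motion in the null direction runs into
    the delayed inhibition and is vetoed."""
    frames = np.asarray(frames, dtype=float)
    prev_exc = np.zeros(frames.shape[1])
    response = 0.0
    for t in range(1, len(frames)):
        exc = np.abs(frames[t] - frames[t - 1])
        if preferred == 'right':
            # null direction is leftward: delayed inhibition arrives
            # from the neighbour on the right
            inhib = np.roll(prev_exc, -1)
            inhib[-1] = 0.0  # no wrap-around at the edge
        else:
            # preferred 'left': delayed inhibition from the left neighbour
            inhib = np.roll(prev_exc, 1)
            inhib[0] = 0.0
        response += np.maximum(exc - w_inhib * inhib, 0.0).sum()
        prev_exc = exc
    return response
```

A stimulus drifting in the preferred direction always outruns the delayed inhibition and excites the layer strongly, while the same stimulus moving in the null direction is largely cancelled, which is the asymmetry the whole-field DSNNs exploit.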
In our system, the LGMD and the TSNN make their own decisions based on visual cues extracted simultaneously and independently. The LGMD plays the key role in detecting imminent collision. The TSNN’s decision is based on translating cues and becomes relevant only once the LGMD has issued a collision alarm: the system then checks the TSNN’s decision in order to eliminate a possible false alarm caused by fast translating objects. We demonstrate the system’s reliability in detecting dangerous imminent collisions by challenging it with driving scenes.
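The gating described above can be written as a simple rule. The function below is a hypothetical sketch of that decision mechanism (the paper's exact rule may involve thresholds and persistence over frames):

```python
def fuse_decisions(lgmd_alarm, tsnn_translating):
    """Gate the two independent decisions: a collision alarm stands only
    when the LGMD has fired AND the TSNN has not attributed the
    excitation to a fast translating object."""
    return bool(lgmd_alarm) and not bool(tsnn_translating)
```

In this scheme a TSNN mistake can suppress a genuine alarm but can never create one, which is why the TSNN's decision is consulted only after the LGMD has fired.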
Section snippets
Formulation of the system
The system for detecting colliding objects has three main parts: an LGMD based neural network for extracting looming cues in depth, a TSNN for translating cues and a decision making mechanism to fuse the looming and translating cues (Fig. 1). Details of the three parts will be given in the following sub-sections.
Parameter setting for driving scenes
We use driving scenes to test the proposed colliding object detection system. The input video images (720 × 576 pixels), provided by Volvo Car Corporation, were taken at 25 frames per second and resized to 100 (horizontal) × 80 (vertical) pixels using the image resize function in Matlab.
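The downsampling step can be sketched as follows. This is a nearest-neighbour stand-in for Matlab's resize function, since the interpolation method actually used is not stated here:

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour downsampling of a grayscale image array."""
    in_h, in_w = img.shape[:2]
    rows = np.arange(out_h) * in_h // out_h  # source row for each output row
    cols = np.arange(out_w) * in_w // out_w  # source column for each output column
    return img[rows[:, None], cols]          # fancy indexing broadcasts to (out_h, out_w)

# a 720 x 576 (width x height) frame reduced to 100 x 80, as in the paper
frame = np.random.randint(0, 256, size=(576, 720), dtype=np.uint8)
small = resize_nearest(frame, 80, 100)
```

Downsampling by roughly a factor of seven in each dimension keeps the per-frame workload of the neural networks small, which matters for the limited computing resources mentioned in the introduction.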
Test results and discussions
Before testing the whole collision detection system, we checked the responses of the asymmetric lateral inhibition based DSNNs when challenged with a rightward moving black bar and a leftward walking pedestrian (Fig. 4a). We then compared these responses with those of an elementary motion detector (EMD) based DSNN to the same stimuli (Fig. 4b and c); the EMD based DSNN had been used in a previous study [13]. The asymmetric lateral inhibition based DSNNs can distinguish the translating cue clearly in
Conclusions
In the sections above, we proposed a collision detection system which consists of two specialised neural networks to extract and fuse different visual cues: the LGMD based neural network responding to objects approaching in depth and the TSNN responding to fast translating visual movement. With the decision making mechanism integrating the two neural networks, the collision detection system works reliably without false alarms, as demonstrated by challenging it with driving scenarios. This
Acknowledgments
This work is supported by EU IST-2001-38097. We thank M. Soininen of Volvo Car Corporation for providing the video clips used in the paper, Dr. R. Stafford for his comments in internal review and Mr. M. Bendall for proof reading the paper. We thank the anonymous reviewers for their invaluable comments.
References (32)
- et al., Collision avoidance using a model of the locust LGMD neuron, Robot. Autom. Syst. (2000)
- et al., Direction selectivity of excitation and inhibition in simple cells of the cat primary visual cortex, Neuron (2005)
- Direction inhibition: a new slant on an old question, Neuron (2005)
- et al., Direction selectivity in the retina, Curr. Opin. Neurobiol. (2002)
- Self-organizing neural networks for perception of visual motion, Neural Netw. (1990)
- et al., Modeling direction selectivity using self-organizing delay-adaptation maps, Neurocomputing (2002)
- et al., Neuromorphic vision sensors, Science (2000)
- et al., Vision for mobile robot navigation: a survey, IEEE Trans. Pattern Anal. Mach. Intell. (2002)
- et al., Locust’s looming detectors for robot sensors
- et al., A silicon implementation of the fly’s optomotor control system, Neural Comput. (2000)
- Bioinspired sensors: from insect eyes to robot vision
- The anatomy of a locust visual interneurone: the descending contralateral movement detector, J. Exp. Biol.
- The neuronal basis of a sensory analyser, the acridid movement detector system. IV. The preference for small field stimuli, J. Exp. Biol.
- Orthopteran DCMD neuron: a reevaluation of responses to moving objects. I. Selective responses to approaching objects, J. Neurophysiol.
- Neural network based on the input organization of an identified neuron signaling impending collision, J. Neurophysiol.
Cited by (35)
- Artificial fly visual joint perception neural network inspired by multiple-regional collision detection, Neural Networks (2021)
- Bio-plausible visual neural network for spatio-temporally spiral motion perception, Neurocomputing (2018)
- Shaping the collision selectivity in a looming sensitive neuron model with parallel ON and OFF pathways and spike frequency adaptation, Neural Networks (2018)
- An integrated neuromimetic architecture for direct motion interpretation in the log-polar domain, Computer Vision and Image Understanding (2014)
- Postsynaptic organisations of directional selective visual neural networks for collision detection, Neurocomputing (2013)
- A modified model for the Lobula Giant Movement Detector and its FPGA implementation, Computer Vision and Image Understanding (2010)