1 Introduction

Since the United States and Europe each launched a ‘Brain Project’ in 2013, brain-inspired intelligence has drawn attention around the world. China has carried out its own brain project, titled “Brain Science and Brain-inspired Intelligence”, since 2016 [1]. How to apply brain-inspired intelligence to pattern recognition and image processing has become a research frontier. The human visual system, the most important sensory system, processes over 80% of the information that the brain receives [2]. Exploring its information processing mechanism and constructing a brain-inspired visual perception model are therefore of great significance to the development of artificial intelligence. Edges are a basic feature of objects, and a large number of biological experiments show that the primary visual pathway, comprising the retina, the lateral geniculate nucleus (LGN) and the primary visual cortex (V1), can detect them [3]. This pathway plays a key role in overall perception. Based on the mechanisms of the primary visual pathway, we propose a feedforward LGN-V1 model that detects edges effectively and lays the foundation for further research.

When recognizing objects, the human brain first detects edges. Edges are formed by factors such as discontinuities in intensity and changes in surface orientation. They are a basic visual feature and play a key role in object recognition [4]. Edge detection retains the important structural information of an image while greatly reducing the amount of data. As a basic topic in image perception and pattern recognition, it has been widely used in autonomous driving and military target detection. At present, edge detection mostly relies on differential operators, with the edge obtained by convolving the image with a template [5]. Common operators include the Canny operator and the Prewitt operator. Although these computer vision methods are effective to some degree, their performance is still far from that of the human brain. As brain research has advanced, a preliminary understanding of the human visual system has emerged, and building brain-inspired models for image processing has become a new direction in computer vision.
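As a point of reference, the template-based approach can be sketched in a few lines of Python. The sketch below applies the standard 3×3 Prewitt templates to a small grayscale image stored as a list of lists. The function name `prewitt_magnitude` and the border handling (border pixels are left at zero) are illustrative assumptions rather than details taken from the cited methods.

```python
import math

# Standard 3x3 Prewitt templates for horizontal and vertical intensity changes.
PREWITT_X = [[-1, 0, 1],
             [-1, 0, 1],
             [-1, 0, 1]]
PREWITT_Y = [[-1, -1, -1],
             [0, 0, 0],
             [1, 1, 1]]


def prewitt_magnitude(image):
    """Return the gradient magnitude of a 2-D list; border pixels stay 0."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0.0
            # Template applied in correlation form; flipping it (true
            # convolution) only changes the sign, not the magnitude.
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    pixel = image[r + dr][c + dc]
                    gx += PREWITT_X[dr + 1][dc + 1] * pixel
                    gy += PREWITT_Y[dr + 1][dc + 1] * pixel
            out[r][c] = math.hypot(gx, gy)
    return out


# A 5x6 image with a vertical step edge between columns 2 and 3.
step = [[0, 0, 0, 1, 1, 1] for _ in range(5)]
edges = prewitt_magnitude(step)
```

On this step image the magnitude is nonzero only in the two columns adjacent to the intensity jump, which is the behavior that template-based detectors rely on.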

In view of this, we establish a brain-inspired model based on the mechanisms of the primary visual pathway. Compared with the Canny operator [6] and the Prewitt operator [7], our model shows better performance and robustness in edge detection.

2 Primary Visual Pathway

Biological experiments have shown that the human visual system has a complex multi-layered neural structure and perceives most external information. The visual pathways are divided into a dorsal stream and a ventral stream, which are responsible for perceiving motion information and static information, respectively [8]. The primary visual pathway, consisting of the retina, the LGN and V1, forms the early stage of the visual system.

In the primary visual pathway, the retina first receives optical signals and transforms them into bio-electrical signals that are sent to the LGN; the LGN then processes this information and passes it to V1, the beginning of the visual cortex. The primary visual pathway senses a great deal of static information. The LGN has a six-layer structure: two magnocellular layers and four parvocellular layers. The magnocellular layers, which feed the dorsal stream, deal with motion information, while the parvocellular layers, which feed the ventral stream, are sensitive to color and contour [9]. The vast majority of LGN cells have a center-surround receptive field (RF), in which the center and the surround are concentric and antagonistic. If the center of the RF is excited by a visual stimulus and the surround inhibits the response, the cell is called a ‘center-on’ cell, as shown in Fig. 1(a); otherwise, it is called a ‘center-off’ cell, as shown in Fig. 1(b). Experiments indicate that the RF profiles of LGN cells follow Gaussian distributions. Such LGN cells can detect contrast changes easily [10].

Fig. 1. LGN cells of different polarity

V1, which is sensitive to both static and dynamic visual stimuli, is the beginning of the visual cortex [11]. Like the LGN, V1 is divided into six layers. Cells in V1 are generally classified as simple cells, complex cells and hypercomplex cells [12]. Hubel and Wiesel found that the RFs of simple cells are small, long and narrow [13]. These RFs exhibit orientation selectivity: a cell is sensitive to light-dark contrast at its preferred orientation [14]. If the orientation of the stimulus deviates from the preferred orientation, the response decreases, and it is almost zero when the deviation is 45° or more. This property makes simple cells highly sensitive to edges at their preferred orientation. The RFs of complex and hypercomplex cells are much larger than those of simple cells, and these cells are more sensitive to bar stimuli. Simple cells receive visual input from the LGN, and their RFs integrate the RFs of LGN cells; the RF size is positively correlated with the number of LGN cells in the local area. The feedforward model based on the primary visual pathway is called the Hubel-Wiesel model, and it lays the foundation for our edge detection model.

3 Brain-Inspired Computational Model

3.1 The Model Based on Bio-Visual System

Based on the mechanisms of the LGN and V1 in the primary visual pathway, we put forward the brain-inspired model shown in Fig. 2.

Fig. 2. Procedure of the proposed model

Figure 2 illustrates the procedure of the computational model, which is based on the mechanisms of LGN cells and V1 simple cells in the primary visual pathway.

The response of a cell group is obtained by computing the weighted sum of the responses of LGN cells with the same polarity and RF size. Like a single LGN cell, a cell group can detect contrast changes, but it does so over a larger RF. A group can be seen as a branch of a simple cell that receives input from a pool of adjacent LGN cells.

As illustrated in Fig. 2, combining the responses of cell groups whose RFs are appropriately aligned in space yields a V1 simple cell with orientation selectivity. The weighted geometric mean of the group responses gives the response of each V1 simple cell, and the responses of all V1 simple cells form the output of our model.

Because center-on and center-off LGN cells have opposite polarity, we define a polarity parameter \( \delta \) and a polarity function \( h \). For a center-on cell, \( \delta \) is ‘\( + \)’ and \( h(x) = x \); for a center-off cell, \( \delta \) is ‘\( - \)’ and \( h(x) = - x \). For both cell types the RF follows a Gaussian profile, so a difference of Gaussian functions (DoG) is adopted:

$$ DoG_{\sigma}^{\delta}(x,y) \overset{\text{def}}{=} h\left[ \frac{1}{2\pi (k\sigma)^{2}} \exp\left( -\frac{x^{2} + y^{2}}{2(k\sigma)^{2}} \right) - \frac{1}{2\pi \sigma^{2}} \exp\left( -\frac{x^{2} + y^{2}}{2\sigma^{2}} \right) \right] $$
(1)

where \( \sigma \) is the standard deviation of the outer Gaussian function and sets the extent of the surround of the RF, and \( k \) is the ratio of the standard deviation of the inner Gaussian to that of the outer one, which sets the relative extents of the excitatory and inhibitory regions. \( DoG_{\sigma}^{+}(x,y) \) and \( DoG_{\sigma}^{-}(x,y) \) represent the RFs of a center-on cell and a center-off cell, respectively.
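A minimal Python sketch may help make the DoG receptive field concrete. It assumes that the center Gaussian has standard deviation \( k\sigma \) with \( k < 1 \) and the surround Gaussian has standard deviation \( \sigma \), so that the center-on kernel is positive at the origin. The function names `gaussian` and `dog_kernel` and the truncation of the kernel at \( 3\sigma \) are illustrative choices, not part of the model description.

```python
import math


def gaussian(x, y, sigma):
    """Normalised 2-D isotropic Gaussian evaluated at (x, y)."""
    return math.exp(-(x * x + y * y) / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)


def dog_kernel(sigma, k=0.5, polarity="+", radius=None):
    """Sampled difference-of-Gaussians RF as a square list of lists.

    Assumption: the centre Gaussian has std k*sigma (k < 1) and the surround
    has std sigma, so the '+' kernel is positive at the origin.
    """
    if radius is None:
        radius = int(math.ceil(3 * sigma))  # assumed truncation at 3 sigma
    sign = 1.0 if polarity == "+" else -1.0  # plays the role of h
    return [[sign * (gaussian(x, y, k * sigma) - gaussian(x, y, sigma))
             for x in range(-radius, radius + 1)]
            for y in range(-radius, radius + 1)]


on = dog_kernel(sigma=2.5, polarity="+")
off = dog_kernel(sigma=2.5, polarity="-")
centre = len(on) // 2
```

Under these assumptions the center-on kernel is excitatory at its center and inhibitory in the surround, the center-off kernel is its exact negative, and both sum to approximately zero, so a uniform image region produces almost no response.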

For an LGN cell with an RF centered at image coordinates \( (x,y) \), the response is obtained by convolving the intensity distribution of the image with the RF of the cell:

$$ l_{\sigma}^{\delta}(x,y) \overset{\text{def}}{=} \left( \boldsymbol{I} * DoG_{\sigma}^{\delta} \right)(x,y) $$
(2)

where \( l_{\sigma}^{\delta}(x,y) \) may be negative, depending on the polarity \( \delta \), whereas the firing rate of a cell cannot be negative. Following Simoncelli [15] and Zou [11], we apply half-wave rectification, so the response of a single cell is:

$$ D_{\sigma}^{\delta}(x,y) = \max\left\{ 0,\; l_{\sigma}^{\delta}(x,y) \right\} $$
(3)
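The convolution and rectification steps of Eqs. (2) and (3) can be sketched as follows. This is a plain-Python illustration under stated assumptions: the image is a list of lists, pixels outside the image are treated as zero, and the 3×3 kernel `toy_on` is a hypothetical stand-in for a sampled DoG kernel, chosen only to keep the example small.

```python
def convolve2d_same(image, kernel):
    """2-D convolution with zero padding; output has the image's shape."""
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    r0, c0 = k_rows // 2, k_cols // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    rr, cc = r - (i - r0), c - (j - c0)  # flipped kernel index
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += kernel[i][j] * image[rr][cc]
            out[r][c] = acc
    return out


def lgn_response(image, kernel):
    """Half-wave rectified response: max(0, I * DoG) at every pixel."""
    return [[max(0.0, v) for v in row] for row in convolve2d_same(image, kernel)]


# Toy centre-on kernel standing in for a sampled DoG (zero-sum, positive centre).
toy_on = [[-1, -1, -1],
          [-1, 8, -1],
          [-1, -1, -1]]
toy_off = [[-v for v in row] for row in toy_on]

# Dark-to-bright vertical step edge between columns 2 and 3.
step = [[0, 0, 0, 1, 1, 1] for _ in range(5)]
on_resp = lgn_response(step, toy_on)
off_resp = lgn_response(step, toy_off)
```

In this toy case the center-on map is active on the bright side of the edge and the center-off map on the dark side, while both are silent in the uniform interior, so the two polarities give complementary, non-negative responses.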

A cell group consists of cells with the same polarity. It is described by the set:

$$ S = \left\{ \left( \delta_{i}, \sigma_{i}, \rho_{i}, \phi_{i} \right) \mid i = 1, 2, \cdots, m \right\} $$
(4)

where \( m \) is the number of cells in a cell group, \( \delta \) is the polarity, \( \sigma \) sets the extent of the RF, and \( \rho \) and \( \phi \) are the radius and the polar angle of the position in polar coordinates.

For a cell group whose RF is centered at coordinates \( (x,y) \), the response is obtained by shifting \( D_{\sigma}^{\delta}(x,y) \) by \( (\Delta x_{i}, \Delta y_{i}) \) and convolving it with the weighting function \( G_{\sigma'} \), taken here to be Gaussian:

$$ Z_{\delta_{i},\sigma_{i},\rho_{i},\phi_{i}}(x,y) \overset{\text{def}}{=} \sum_{x'} \sum_{y'} D_{\sigma_{i}}^{\delta_{i}}\left( x - \Delta x_{i} - x',\; y - \Delta y_{i} - y' \right) G_{\sigma'}(x', y') $$
(5)

In the equation above,

$$ \begin{array}{l} \Delta x_{i} = -\rho_{i}\cos\phi_{i}, \quad \Delta y_{i} = -\rho_{i}\sin\phi_{i} \\ -3\sigma' \le x', y' \le 3\sigma' \end{array} $$
(6)

where \( \sigma' = b + \lambda\sigma \).
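The blur-and-shift operation of Eqs. (5) and (6) can be sketched for a single pixel as follows. The sketch assumes that \( G_{\sigma'} \) is a normalized 2-D Gaussian, that the shifted position is rounded to the nearest pixel, and that samples outside the map count as zero; the function name `group_response` and the toy input are hypothetical.

```python
import math


def group_response(D, x, y, rho, phi, sigma, b=0.33, lam=0.15):
    """Blurred and shifted LGN response at pixel (x, y).

    D is indexed as D[row][col] = D[y][x]; samples outside D count as zero.
    """
    sigma_p = b + lam * sigma                    # sigma' = b + lambda * sigma
    dx = -rho * math.cos(phi)
    dy = -rho * math.sin(phi)
    half = int(math.floor(3 * sigma_p))          # support |x'|, |y'| <= 3 sigma'
    rows, cols = len(D), len(D[0])
    total = 0.0
    for yp in range(-half, half + 1):
        for xp in range(-half, half + 1):
            # Rounding the shifted position to the pixel grid is an assumption.
            col = int(round(x - dx - xp))
            row = int(round(y - dy - yp))
            if 0 <= row < rows and 0 <= col < cols:
                g = math.exp(-(xp * xp + yp * yp) / (2 * sigma_p ** 2))
                g /= 2 * math.pi * sigma_p ** 2  # assumed Gaussian G_sigma'
                total += D[row][col] * g
    return total


# Toy rectified LGN map: a single active cell at column 6, row 4.
D = [[0.0] * 9 for _ in range(9)]
D[4][6] = 1.0
z_near = group_response(D, x=4, y=4, rho=2.0, phi=0.0, sigma=2.5)  # looks at (6, 4)
z_far = group_response(D, x=6, y=4, rho=2.0, phi=0.0, sigma=2.5)   # looks at (8, 4)
```

With \( \rho = 2 \) and \( \phi = 0 \), the group centered at \( x = 4 \) samples the LGN map two pixels to the right, exactly where the active cell sits, so `z_near` is much larger than `z_far`. This tolerance around a displaced sampling position is what lets a set of groups encode the spatial layout of a simple cell's RF.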

We define \( Q_{S,\phi_{i}} \) as the response of a V1 simple cell whose preferred orientation is \( \phi_{i} \). It is determined by all the cell groups from which the cell receives input. According to the biological experiments of Hubel and Wiesel, a V1 simple cell is excited by a visual stimulus only if all of its groups are excited. To capture this property, we multiply the responses of all groups and take the weighted geometric mean as the response of the simple cell:

$$ Q_{S,\phi_{i}}(x,y) \overset{\text{def}}{=} \left( \prod_{i=1}^{|S|} \left( Z_{\delta_{i},\sigma_{i},\rho_{i},\phi_{i}}(x,y) \right)^{\omega_{i}} \right)^{1/\sum_{i=1}^{|S|} \omega_{i}} $$
(7)

where

$$ \begin{aligned} \omega_{i} & = \exp\left( -\frac{\rho_{i}}{2\sigma'^{2}} \right) \\ \sigma' & = \frac{1}{3} \max_{i \in \{1, \ldots, |S|\}} \left\{ \rho_{i} \right\} \end{aligned} $$
(8)
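The weighted geometric mean of Eqs. (7) and (8) can be sketched for one pixel as follows. The weights are implemented exactly as printed in Eq. (8); evaluating the product in the log domain is an implementation choice made here for numerical stability, and the function name `simple_cell_response` is hypothetical. The sketch assumes that at least one \( \rho_{i} \) is greater than zero.

```python
import math


def simple_cell_response(group_responses, rhos):
    """Weighted geometric mean of the group responses at one pixel.

    Weights follow Eq. (8) as printed: w_i = exp(-rho_i / (2 * sigma'^2)),
    with sigma' = max(rho) / 3 (requires max(rho) > 0).
    """
    sigma_p = max(rhos) / 3.0
    weights = [math.exp(-rho / (2.0 * sigma_p ** 2)) for rho in rhos]
    if any(z <= 0.0 for z in group_responses):
        return 0.0                      # one silent group silences the cell
    log_sum = sum(w * math.log(z) for w, z in zip(weights, group_responses))
    return math.exp(log_sum / sum(weights))


q_equal = simple_cell_response([0.5, 0.5, 0.5], [0, 3, 6])
q_silent = simple_cell_response([0.5, 0.0, 0.5], [0, 3, 6])
q_mixed = simple_cell_response([0.2, 0.4, 0.8], [0, 3, 6])
```

The multiplicative form acts like a soft AND: `q_silent` is zero because one group does not respond, whereas a weighted sum would still report activity. This is how the model expresses the requirement that all groups be excited.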

We define the set of orientations \( \beta = \left\{ 2\pi i / n \mid 0 \le i < n \right\} \), which divides the interval from 0 to \( 2\pi \) evenly into \( n \) preferred orientations of V1 simple cells. For \( i = 0, 1, \ldots, n - 1 \), we sum the responses of all simple cells to obtain the overall response:

$$ Q_{all} (x,y) = \sum\limits_{\phi \in \beta } {Q_{S,\phi } (x,y)} $$
(9)
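The orientation set and the summation of Eq. (9) can be sketched as follows. The response maps in the example are hypothetical placeholders; in the model they would be the simple-cell maps \( Q_{S,\phi} \) computed for each orientation.

```python
import math


def orientations(n):
    """Preferred orientations beta = {2*pi*i/n | 0 <= i < n}."""
    return [2.0 * math.pi * i / n for i in range(n)]


def total_response(per_orientation_maps):
    """Pixel-wise sum of the simple-cell maps over all orientations."""
    rows = len(per_orientation_maps[0])
    cols = len(per_orientation_maps[0][0])
    return [[sum(m[r][c] for m in per_orientation_maps) for c in range(cols)]
            for r in range(rows)]


beta = orientations(12)
# Hypothetical 2x2 response maps, one per orientation, for illustration only.
maps = [[[0.1 * i, 0.0], [0.0, 1.0]] for i in range(len(beta))]
q_all = total_response(maps)
```

With \( n = 12 \) the orientations are spaced 30° apart, and a pixel contributes to the final edge map whenever at least one orientation responds there.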

3.2 Related Parameters Setting

To qualitatively match the physiological data [14, 16], the parameters are set as follows: \( \sigma = 2.5 \), \( k = 0.5 \), \( b = 0.33 \), \( \lambda = 0.15 \). Using the control variable method, that is, varying one parameter while holding the others fixed, we set \( m = 16 \) to obtain better orientation selectivity. The tuning curves of simple cells (preferred orientation 0°) are shown in Fig. 3.

Fig. 3. Tuning curves of simple cells (preferred orientation is 0°)

From Fig. 3, we can see that the model simple cells reproduce the orientation tuning of real cells, which supports the validity of the model.

With the parameters above fixed, we need to find the value of \( n \) that detects edges well while keeping the amount of data low. We take the image named mainbuilding in the USF (University of South Florida) database as an example. In Fig. 4, (a) is the original image; (d) is the ground truth of the edges; and (b), (c), (e) and (f) are the edges detected by the proposed model with \( n = 4 \), \( n = 8 \), \( n = 12 \) and \( n = 16 \), respectively.

Fig. 4. Edges detected by the model with different values of n

From Fig. 4, we can see that with \( n = 4 \) the model misses many edges; the detected edges are incomplete, and the object can hardly be identified from them. With \( n = 8 \), the model detects more edges, but the result is still far from satisfactory. With \( n = 12 \), the model detects enough edges for the picture to be recognized well, and the result achieves the expected effect. With \( n = 16 \), the result is nearly the same as with \( n = 12 \). Therefore, to obtain the expected result while reducing the amount of data, we set \( n = 12 \). In other words, 12 simple cells with different preferred orientations together capture the behavior of all cells in edge detection. The edges detected by the 12 model simple cells are shown in Fig. 5. From Fig. 5, we can see that a model cell is sensitive only to visual stimuli at its preferred orientation, which reflects the orientation selectivity observed in biology.

Fig. 5. Edges detected by cells with different orientations

Unless specifically noted, this set of parameter values is retained for all the simulation results that follow.

4 Analysis of Experimental Results

To examine the effectiveness of our model, we test it on images from the USF (University of South Florida) database. The results are shown in Fig. 6.

Fig. 6. Edge detection of images with no noise

From Fig. 6, we can see that our model preserves the complete edges of the image, and the object can be identified from the extracted result. The Canny operator extracts the basic edges, but it also marks some non-edge regions as edges, which causes false detections. The performance of the Prewitt operator lies between that of our model and that of the Canny operator. In short, our model performs better than both the Canny operator and the Prewitt operator.

Noise resistance and robustness are important indicators of edge detection performance. To test the robustness of our model, we compare the edges detected by the three methods after adding salt-and-pepper noise (density 0.1) and Gaussian noise (mean 0, variance 0.1), respectively, to the original images. The results are shown in Figs. 7 and 8.
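For readers who wish to set up a similar robustness test, the two noise types can be sketched as follows. The sketch assumes that image intensities are scaled to [0, 1], which is the convention under which a density of 0.1 and a variance of 0.1 are usually interpreted; the function names and the fixed random seeds are illustrative and do not describe the tooling actually used in the experiments.

```python
import random


def add_salt_and_pepper(image, density=0.1, rng=None):
    """Set roughly `density` of the pixels to 0 or 1 with equal probability."""
    if rng is None:
        rng = random.Random(0)
    out = [row[:] for row in image]
    for row in out:
        for c in range(len(row)):
            if rng.random() < density:
                row[c] = 1.0 if rng.random() < 0.5 else 0.0
    return out


def add_gaussian_noise(image, mean=0.0, variance=0.1, rng=None):
    """Add N(mean, variance) noise to each pixel and clip to [0, 1]."""
    if rng is None:
        rng = random.Random(0)
    std = variance ** 0.5
    return [[min(1.0, max(0.0, v + rng.gauss(mean, std))) for v in row]
            for row in image]


# Uniform mid-gray test image with intensities in [0, 1].
flat = [[0.5] * 100 for _ in range(100)]
sp = add_salt_and_pepper(flat, density=0.1, rng=random.Random(1))
gn = add_gaussian_noise(flat, mean=0.0, variance=0.1, rng=random.Random(1))
```

Salt-and-pepper noise replaces isolated pixels with extreme values, whereas Gaussian noise perturbs every pixel, which is why the two cases stress an edge detector in different ways.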

Fig. 7. Edge detection of images with salt & pepper noise

Fig. 8. Edge detection of images with Gaussian noise

It can be seen from Fig. 7 that, after salt-and-pepper noise is added, the images contain a large number of pixels with abrupt gray-level changes, so the edges are weakened. Edge detection becomes more difficult because most methods rely on sudden changes in pixel values. The Canny and Prewitt operators produce a large number of false detections in this case, and it is difficult to identify the objects. In contrast, our model extracts most edges and retains the basic appearance of the objects.

From Fig. 8, we find that, after Gaussian noise is added, there is a certain gray-level gradient everywhere, and the noise blurs the edges, which again makes edge detection more difficult. In this case, the Canny operator is more sensitive to the Gaussian noise and brings a lot of noise into the detected edges, so the basic shape of the object can hardly be identified. The Prewitt operator and our model both perform well on images with Gaussian noise, but the Prewitt operator breaks some obvious edges into discontinuous points, so our model performs better.

Based on the above analysis, our model is better at edge detection than the Canny and Prewitt operators. It reproduces the edge-perceiving properties of V1 simple cells, and it is effective and robust in edge detection.

5 Conclusion

In this paper, a brain-inspired edge detection model is established based on the receptive fields of LGN cells and V1 simple cells in the brain’s primary visual pathway. The model reflects the biological properties of these cells. We built a mathematical model to simulate the visual system and applied it to edge detection, an important part of image processing. Comparison with traditional edge detection operators demonstrates the robustness of the proposed model.

The model proposed in this paper simulates the primary visual pathway. It realizes the perception of static edge information and explores the application of brain cognitive mechanisms to image processing. Because the human visual system can perceive complex static and dynamic information, we will continue to explore it in further research. Next, we will explore the mechanisms of V1 complex cells and build spatial and temporal models to simulate more properties of the primary visual pathway. With brain-inspired intelligence and the motivation provided by the ‘Brain Project’, pattern recognition and image processing will achieve greater development in the future.