Elsevier

Neurocomputing

Volumes 44–46, June 2002, Pages 703-708

Invariant encoding of spatial stimulus topology in the temporal domain

https://doi.org/10.1016/S0925-2312(02)00461-7

Abstract

Invariant representations have emerged as a central topic in vision research. Here we investigate how an ensemble of laterally coupled neurons transforms visual spatial patterns into the temporal domain. The central network property underlying this transformation is that transmission delays between neurons increase monotonically with the distance between them. We show that stimuli become encoded in the temporal population activity and that this representation is invariant with respect to translations, rotations and small distortions. Furthermore, the proposed model provides rapid encoding, in accordance with physiological results.

Introduction

Primates excel at tasks like visual object recognition, tolerating considerable changes in images due, for instance, to different viewing angles and deformations. Elucidating the mechanisms of such invariant pattern recognition is an active field of research in neuroscience [10], [12], [6], [4]. However, very little is still known about the underlying algorithms and mechanisms. A number of models have been proposed which aim to reproduce capabilities of the biological visual system, such as invariance to shifts in position, rotation, scaling and distortion [9], [19], [7].

Recently, the importance of the temporal dynamics of neuronal activity in representing visual stimuli has moved into the focus of neuroscience [2], [11], [8]. Several modeling studies have addressed the properties of temporal codes [1], [17], [5]. For instance, Buonomano and Merzenich [3] proposed a model for position-invariant pattern recognition using temporal coding. Here, we build on these concepts and investigate the formation of invariant representations by the dynamics of activity of neuronal populations.

We investigate a simplified model of primary visual cortex consisting of a map of integrate-and-fire neurons with local excitatory interactions. A central feature of this model is the monotonic relationship between the transmission delays in this lateral coupling and the distance between pre- and post-synaptic neurons. We hypothesize that this network property induces dynamics of neuronal activity that are specific to the geometrical shape of a stimulus and invariant with respect to translations and rotations. Such a representation would emerge naturally, without the need to train on a stimulus repeatedly at different positions or orientations. To test the validity of this approach, we determine the amount of information contained in the temporal population responses of this network for different parameters and stimulus sets. This is done statistically, using a clustering algorithm combined with temporal correlation as a similarity measure. The resulting temporal structure of network activity provides a representation that is position- as well as rotation-invariant. Moreover, it is robust to synaptic noise and to local and global stimulus distortions. These results suggest that temporal coding is important for invariant pattern recognition, not at the level of single neurons but at the population level.
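The geometric intuition behind this hypothesis can be illustrated with a small sketch. If delays depend only on the distance between two neurons, then the set of delays among the stimulated positions is unchanged when the stimulus is shifted or rotated, because rigid transformations preserve pairwise distances. The code below is not the model itself: it uses continuous coordinates rather than a neuron grid, and the names `pairwise_delays`, `rigid_transform` and `delay_per_unit`, the linear delay law, and the L-shaped stimulus are all assumptions made for illustration.

```python
import numpy as np


def pairwise_delays(points, delay_per_unit=1.0):
    """Return the sorted delays between all pairs of stimulated positions.

    Assumes delay = delay_per_unit * Euclidean distance. This linear law is
    a hypothetical choice; the text only requires a monotonic relationship.
    """
    pts = np.asarray(points, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(len(pts), k=1)  # each unordered pair once
    return np.sort(delay_per_unit * dist[upper])


def rigid_transform(points, angle_rad, shift):
    """Rotate the points about the origin, then translate them."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    rotation = np.array([[c, -s], [s, c]])
    return np.asarray(points, dtype=float) @ rotation.T + np.asarray(shift, dtype=float)


# A made-up L-shaped stimulus (illustrative only, not from the paper).
stimulus = [(0, 0), (0, 1), (0, 2), (1, 0), (2, 0)]
moved = rigid_transform(stimulus, angle_rad=np.pi / 3, shift=(5.0, -2.0))

# The delay structure is the same before and after the transformation.
print(np.allclose(pairwise_delays(stimulus), pairwise_delays(moved)))
```

On a discrete neuron array, rotations by arbitrary angles only approximately preserve distances, so this exact equality should be read as the idealized limit of the argument.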

Methods

The investigated network consists of a two-dimensional array of 40×40 standard conductance-based leaky integrate-and-fire neurons, which are modeled with a spike-triggered potassium conductance yielding frequency adaptation. Each neuron connects to a circular neighborhood of fixed size, such that only neurons separated by a Euclidean distance ⩽9 cells are connected. The synapses are of equal strength and modeled as instantaneous excitatory conductances, while transmission delays are proportional to the
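The lateral coupling described in this snippet can be sketched as a connectivity mask plus a delay matrix. The grid size (40×40) and the connection radius (9 cells) come from the text. Everything else is an assumption: the snippet is cut off before it states what the delays are proportional to, so the code takes them to be proportional to Euclidean distance (consistent with the monotonic delay-distance relationship stated earlier), uses a hypothetical scale `DELAY_PER_CELL`, excludes self-connections, and uses non-periodic boundaries.

```python
import numpy as np

GRID = 40              # 40x40 neuron array (from the text)
RADIUS = 9.0           # connection radius in cells (from the text)
DELAY_PER_CELL = 1.0   # hypothetical delay scale, arbitrary time units


def build_lateral_coupling(grid=GRID, radius=RADIUS, delay_per_cell=DELAY_PER_CELL):
    """Return (connected, delays) for all neuron pairs on a grid x grid array.

    connected[i, j] is True if neurons i and j lie within `radius` cells of
    each other (self-connections excluded, an assumption). delays[i, j] is
    delay_per_cell * distance for connected pairs and inf otherwise.
    """
    rows, cols = np.divmod(np.arange(grid * grid), grid)
    dist = np.hypot(rows[:, None] - rows[None, :], cols[:, None] - cols[None, :])
    connected = (dist <= radius) & (dist > 0)
    delays = np.where(connected, delay_per_cell * dist, np.inf)
    return connected, delays


connected, delays = build_lateral_coupling()
center = 20 * GRID + 20  # a neuron far enough from every edge
print(connected[center].sum())
```

With these assumptions, a neuron at least 9 cells from every edge has 252 neighbors, while neurons near the border have fewer because the boundaries are not wrapped.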

Results

In the first experiment, the network's performance in encoding stimuli invariant to distortion was investigated by presenting the six stimulus classes shown in Fig. 1a. The arrangement reflects an intuitive notion of topology, i.e. class 1 is visually more similar to class 2 or 4 than to class 5 or 6. For each stimulus class, 24 hand-drawn examples were presented to the network (Fig. 1b). For a quantitative analysis, the average responses of the network were clustered into six classes using

Discussion

We have shown that in a model of a cortical network, the interaction of network- and stimulus-topology induces stimulus-specific but transformation-invariant temporal dynamics. Thus, stimuli are represented by a temporal population code. This representation is position- and rotation-invariant as well as invariant to moderate distortions. The stimulus encoding preserves an intuitive notion of visual similarity; classification errors predominantly occur by confusing visually similar stimuli. In

Acknowledgements

The authors thank Manuel Sanchez-Montanes, Mark Blanchard and Arik Zucker for valuable discussions and contributions to this study. This work was supported by SPP-SNF.
