Elsevier

Neural Networks

Volume 48, December 2013, Pages 109-124

Categorization and decision-making in a neurobiologically plausible spiking network using a STDP-like learning rule

https://doi.org/10.1016/j.neunet.2013.07.012

Abstract

Understanding how the human brain is able to efficiently perceive and understand a visual scene is still a field of ongoing research. Although many studies have focused on the design and optimization of neural networks to solve visual recognition tasks, most of them lack either neurobiologically plausible learning rules or decision-making processes. Here we present a large-scale model of a hierarchical spiking neural network (SNN) that integrates a low-level memory encoding mechanism with a higher-level decision process to perform a visual classification task in real-time. The model consists of Izhikevich neurons and conductance-based synapses for realistic approximation of neuronal dynamics, a spike-timing-dependent plasticity (STDP) synaptic learning rule with additional synaptic dynamics for memory encoding, and an accumulator model for memory retrieval and categorization. The full network, which comprised 71,026 neurons and approximately 133 million synapses, ran in real-time on a single off-the-shelf graphics processing unit (GPU). The network was constructed on a publicly available SNN simulator that supports general-purpose neuromorphic computer chips. The network achieved 92% correct classifications on MNIST in 100 rounds of random sub-sampling, which is comparable to other SNN approaches and provides a conservative and reliable performance metric. Additionally, the model correctly predicted reaction times from psychophysical experiments. Because of the scalability of the approach and its neurobiological fidelity, the current model can be extended to an efficient neuromorphic implementation that supports more generalized object recognition and decision-making architectures found in the brain.

Introduction

Object recognition in monkeys has traditionally been associated with an anatomically distinct pathway termed the “what” (or ventral) visual stream (Ungerleider & Haxby, 1994), which consists of at least V1, V2, V4, and various regions in the inferior and anterior temporal cortices (e.g., TEO, TE1–TE3, TEa, and TEm) (Rolls, 2012, Rolls and Deco, 2002). While traveling along this pathway, the characteristics of the stimuli to which neurons respond become more complex (Rolls, 2012, Ungerleider and Haxby, 1994), ranging from rather simple stimuli with small receptive fields such as oriented bars in V1 (Hubel & Wiesel, 1965) to relatively large and more abstract objects such as faces in the inferotemporal cortex (IT) (Bruce, Desimone, & Gross, 1981). These empirical observations have led to a number of classic studies modeling the ventral stream as a hierarchical feed-forward network, such as the Neocognitron (Fukushima, 1980), HMAX (Riesenhuber & Poggio, 1999), or VisNet (Rolls, 2012, Wallis and Rolls, 1997), although it should be noted that the notion of a strictly hierarchical or feed-forward network has been questioned by recent anatomical studies that reserve a more important functional role for bi-directional and non-hierarchical connections (Markov et al., 2012, Markov et al., 2011). Inspired by these classic models, a variety of more conventional machine learning algorithms have emerged that demonstrate extraordinary performance on certain recognition tasks, such as convolutional neural networks (CNNs) in handwriting recognition (Ciresan et al., 2011, LeCun et al., 1998, Simard et al., 2003), or, for that matter, adaptive boosting in face recognition (Viola & Jones, 2001). Although CNNs implement a network topology that is biologically inspired, they often rely on error backpropagation (gradient descent), which has been criticized for being biologically unrealistic because it involves variables that cannot be computed locally (Rolls & Deco, 2002).
Part of the challenge is thus to discover how comparably hard problems can be solved by more biologically plausible networks relying on local learning rules that operate on the abstraction level of a synapse.

A potential candidate for such a mechanism is spike-timing-dependent plasticity (STDP) (Bi and Poo, 2001, Sjöström et al., 2001, Song et al., 2000), a paradigm in which the weight of a synapse is modulated according to the temporal order of pre- and postsynaptic spikes, and thus their degree of causality. Many different variants of STDP seem to exist in the brain, and many different models to explain them have emerged over the years (Morrison, Diesmann, & Gerstner, 2008). In an effort to implement STDP-like learning rules using only information locally available at the synapse, without algorithmically storing spike timings, several models have proposed pairing presynaptic spiking with postsynaptic voltage, determining the weight change by using either the temporal change of postsynaptic voltage (Porr, Saudargiene, & Worgotter, 2004), a piece-wise linear function to approximate postsynaptic voltage (Gorchetchnikov, Versace, & Hasselmo, 2005), or postsynaptic calcium concentration (Brader et al., 2007, Graupner and Brunel, 2012). Networks with STDP have been shown to learn precise spike times through supervised learning (Legenstein et al., 2005, Pfister et al., 2006), to implement reinforcement learning (Florian, 2007, Izhikevich, 2007b, O’Brien and Srinivasa, 2013), to develop localized receptive fields (Clopath, Busing, Vasilaki, & Gerstner, 2010), and to classify highly correlated patterns of neuronal activity (Brader et al., 2007).
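To make the basic principle concrete, the classic pair-based STDP curve (Song et al., 2000) can be sketched in a few lines. This is a generic illustration of the paradigm, not the calcium-gated rule of Brader et al. (2007) used later in this paper, and all parameter values here are illustrative:

```python
import math

def stdp_dw(delta_t, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Pair-based STDP weight change for a spike-time difference
    delta_t = t_post - t_pre (ms). A presynaptic spike shortly before a
    postsynaptic spike (delta_t > 0, a causal pairing) potentiates the
    synapse; the reverse ordering depresses it, with both effects
    decaying exponentially as the spikes move apart in time."""
    if delta_t > 0:
        return a_plus * math.exp(-delta_t / tau_plus)
    return -a_minus * math.exp(delta_t / tau_minus)

# Causal pairings strengthen the synapse, anti-causal pairings weaken it:
dw_causal = stdp_dw(5.0)       # positive (potentiation)
dw_anticausal = stdp_dw(-5.0)  # negative (depression)
```

The voltage- and calcium-based variants cited above replace the explicit spike-timing difference with quantities that are available locally at the synapse.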

Once an internal representation of a visual object is built in the brain, the question then remains how this memory can be retrieved from the system in order to make a perceptual decision. A general mechanism has been suggested to involve the temporal integration and comparison of the outputs of different pools of sensory neurons in order to compute a decision variable (Heekeren, Marrett, Bandettini, & Ungerleider, 2004). This temporal integration might be performed in one of several regions such as the dorsolateral prefrontal cortex (dlPFC) (Heekeren et al., 2004, Kim and Shadlen, 1999), lateral intraparietal area (LIP) (Shadlen & Newsome, 2001), superior colliculus (SC) (Horwitz & Newsome, 1999), frontal eye fields (FEF) (Schall, 2002, Schall and Thompson, 1999, Thompson et al., 1996) or intraparietal sulcus (IPS) (Colby & Goldberg, 1999), which all cooperate in order to translate the accumulated evidence into an action (Heekeren et al., 2008, Rorie and Newsome, 2005). Neuronal activity in integrator areas gradually increases and then remains elevated until a response is given, with the rate of increase being slower during more difficult trials. A successful approach to explaining these kinds of neurophysiological data has been by means of drift–diffusion or race models (Bogacz et al., 2006, Schall and Thompson, 1999, Smith and Ratcliff, 2004), in which the noisy sensory information is integrated over time until a decision threshold is reached.
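A minimal race-model sketch (one simple member of the drift–diffusion family cited above, with hypothetical drift, noise, and threshold values) illustrates how integration to a bound yields both a choice and a reaction time:

```python
import random

def race_decision(drifts, threshold=1.0, noise=0.1, dt=1.0,
                  max_steps=10000, rng=None):
    """Race model: one accumulator per response alternative integrates
    its mean evidence (drift) plus Gaussian noise; the first accumulator
    to reach the threshold determines both the choice and the reaction
    time (in integration steps)."""
    rng = rng or random.Random(0)
    acc = [0.0] * len(drifts)
    for step in range(1, max_steps + 1):
        for i, drift in enumerate(drifts):
            # Rectify at zero: firing rates cannot be negative.
            acc[i] = max(0.0, acc[i] + dt * (drift + rng.gauss(0.0, noise)))
        winners = [i for i in range(len(acc)) if acc[i] >= threshold]
        if winners:
            return max(winners, key=lambda i: acc[i]), step
    return None, max_steps

# With noise switched off the dynamics are deterministic: stronger
# evidence wins, and weaker evidence (a harder trial) takes longer to
# reach the bound, mirroring the slower ramping on difficult trials.
choice, rt_easy = race_decision([0.0625, 0.03125], noise=0.0)  # choice 0, rt 16
_, rt_hard = race_decision([0.03125, 0.015625], noise=0.0)     # rt 32
```

With noise enabled, repeated runs of the same stimulus produce a distribution of reaction times and occasional errors, the two behavioral signatures examined later in the paper.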

Here we present a large-scale model of a hierarchical spiking neural network (SNN) that integrates a low-level memory encoding mechanism with a higher-level decision process to perform a visual classification task in real-time. The model consists of Izhikevich neurons and conductance-based synapses for realistic approximation of neuronal dynamics (Dayan and Abbott, 2001, Izhikevich, 2003, Izhikevich et al., 2004), a STDP synaptic learning rule with additional synaptic dynamics for memory encoding (Brader et al., 2007), and an accumulator model for memory retrieval and categorization (Smith & Ratcliff, 2004). Grayscale input images were fed through a feed-forward network consisting of V1 and V2, which then projected to a layer of downstream classifier neurons through plastic synapses that implement the STDP-like learning rule mentioned above. Population responses of these classifier neurons were then integrated over time to make a perceptual decision about the presented stimulus. The full network, which comprised 71,026 neurons and approximately 133 million synapses, ran in real-time on a single off-the-shelf graphics processing unit (GPU).
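For reference, the dynamics of a single Izhikevich (2003) neuron can be reproduced with the published two-variable update. The regular-spiking parameters (a = 0.02, b = 0.2, c = −65, d = 8) are the standard ones; the constant input current and the 1 ms forward-Euler step are simplifications of the conductance-based synaptic drive used in the actual model:

```python
def izhikevich_spikes(I, a=0.02, b=0.2, c=-65.0, d=8.0, T=1000, dt=1.0):
    """Simulate an Izhikevich neuron for T ms under a constant input
    current I and return the spike times (ms). v is the membrane
    potential (mV) and u a recovery variable; when v reaches the 30 mV
    cutoff, v is reset to c and u is incremented by d."""
    v, u = c, b * c
    spikes = []
    for step in range(int(T / dt)):
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + I)
        u += dt * a * (b * v - u)
        if v >= 30.0:  # spike cutoff and reset
            spikes.append(step * dt)
            v, u = c, u + d
    return spikes

# With no input the neuron sits at rest; a suprathreshold current
# produces tonic spiking:
quiet = izhikevich_spikes(0.0)   # no spikes
tonic = izhikevich_spikes(10.0)  # regular spiking
```

The appeal of this model for large-scale simulation is that two coupled one-line updates per neuron reproduce a wide range of cortical firing patterns at a fraction of the cost of Hodgkin–Huxley-style models.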

In order to evaluate the feasibility of our model, we applied it to the extensively studied MNIST database of handwritten digits (LeCun et al., 1998). Due to the large variability within a given class of digits and a high level of correlation between members of different classes, the database provides stimuli whose categorization might span a wide range of difficulty levels, and as such is well-suited as a first benchmark for our model. However, it should be noted that MNIST does not pose many of the challenges of biological vision, such as distractors, occluders or translation invariance. Moreover, all the images are static and isolated in their receptive field. The network achieved 92% correct classifications, which is comparable to other SNN approaches (Brader et al., 2007, Querlioz et al., 2011) and simple machine learning algorithms such as linear classifiers, k-nearest-neighbor classifiers, and simple artificial neural networks (LeCun et al., 1998), but not to state-of-the-art models whose performance is close to 99.8% (Ciresan et al., 2011, Niu and Suen, 2012).

Additionally, our network produces reaction time (RT) distributions that are comparable to the behavioral RT distributions reported in psychophysical experiments. For example, we show that when the network makes an error, its RT is significantly slower than when making a correct class prediction; and that RTs do not decrease when the target stimulus has become familiar, which has also been observed in a rapid categorization study (Fabre-Thorpe, Richard, & Thorpe, 1998).

Although the present model does not reach the performance of specialized classification systems (Ciresan et al., 2011, Niu and Suen, 2012), our model represents a first step towards the construction of a general-purpose neurobiologically inspired model of visual recognition and perceptual decision-making. The model includes many neurobiologically inspired details not found in the algorithms described above. The present network was constructed on a publicly available SNN simulator that uses design principles, data structures, and process flows that are in compliance with general-purpose neuromorphic computer chips, and that allows for real-time execution on off-the-shelf GPUs (Richert, Nageswaran, Dutt, & Krichmar, 2011); its neuron model, synapse model, and address-event representation (AER) are compatible with recent neuromorphic hardware (Srinivasa & Cruz-Albrecht, 2012). Because of the scalability of our approach, the current model can readily be extended to an efficient neuromorphic implementation that supports the simulation of more generalized object recognition and decision-making regions found in the brain. Ultimately, understanding the neural mechanisms that mediate perceptual decision-making based on sensory evidence will further our understanding of how the brain is able to make the more complex decisions we encounter in everyday life (Lieberman, 2007), and could shed light on phenomena like the misperception of objects in neuropsychiatric disorders such as schizophrenia (Persaud and Cutting, 1991, Summerfield et al., 2006).

Section snippets

Methods

We performed all simulations in a large-scale SNN simulator which allows execution on both generic x86 central processing units (CPUs) and standard off-the-shelf GPUs (Richert et al., 2011). The simulator provides a PyNN-like environment (PyNN is a common programming interface developed by the neuronal simulation community) in C/C++ and is publicly available at http://www.socsci.uci.edu/~jkrichma/Richert-FrontNeuroinf-SourceCode.zip. The simulator’s API allows for details and parameters to

Results

We addressed the question of how many training samples are needed to allow good classification by varying the size of the training set between ten patterns (one per digit) and 2000 patterns (200 per digit). The testing set always consisted of 1000 patterns the network had not seen before. We ran a total of four experiments, where each experiment featured a set of ntrain training samples, and each experiment was run 100 times. The number of training cycles was adjusted such that overall 2000
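The random sub-sampling protocol just described can be sketched as follows; `train_fn` and `test_fn` are hypothetical stand-ins for the network's training and evaluation routines, and `data` is a list of `(sample, label)` pairs:

```python
import random

def random_subsampling(data, train_fn, test_fn, n_train, n_test,
                       n_rounds=100, seed=0):
    """Random sub-sampling validation: each round draws a fresh random
    training set and a disjoint test set of unseen patterns, trains a
    model, and records its test accuracy. Returns the mean accuracy
    over all rounds."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(n_rounds):
        shuffled = list(data)
        rng.shuffle(shuffled)
        train = shuffled[:n_train]
        test = shuffled[n_train:n_train + n_test]
        model = train_fn(train)
        correct = sum(test_fn(model, x) == label for x, label in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)
```

Averaging over many disjoint train/test draws (100 per experiment here) is what makes the reported accuracy conservative: no single lucky split can inflate it.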

Discussion

The main contributions of the present study are as follows. First, we modified the original model (Brader et al., 2007) to be more biologically plausible, most notably by (i) implementing a SNN using Izhikevich spiking neurons and conductance-based synapses, (ii) implementing the different dynamics seen in excitatory and inhibitory neurons, (iii) incorporating a pre-processing stage that approximates the spatiotemporal tuning properties of simple and complex cells in the primary visual cortex,

Conclusion

We have presented experimental results from a neurobiologically plausible spiking network that is able to rapidly categorize highly correlated patterns of neural activity. Our approach demonstrates how a STDP-like learning rule (previously described in Brader et al. (2007)) can be utilized to store object information in a SNN, and how a simple decision-making paradigm is able to retrieve this memory in a way that allows the network to generalize to a large number of MNIST exemplars.

Acknowledgments

We thank four anonymous reviewers for their feedback, which has greatly improved the manuscript. This work was supported by the Defense Advanced Research Projects Agency (DARPA) subcontract 801888-BS.

References (74)

  • Y. Amit et al. (2001). Attractor networks for shape recognition. Neural Computation.
  • D.J. Amit et al. (2003). Selective delay activity in the cortex: phenomena and interpretation. Cerebral Cortex.
  • G. Bi et al. (2001). Synaptic modification by correlated activity: Hebb’s postulate revisited. Annual Review of Neuroscience.
  • R. Bogacz et al. (2006). The physics of optimal decision making: a formal analysis of models of performance in two-alternative forced-choice tasks. Psychological Review.
  • J.M. Brader et al. (2007). Learning real-world stimuli in a neural network with spike-driven synaptic dynamics. Neural Computation.
  • V. Braitenberg et al. (1998). Cortex: statistics and geometry of neuronal connectivity.
  • C. Bruce et al. (1981). Visual properties of neurons in a polysensory area in superior temporal sulcus of the macaque. Journal of Neurophysiology.
  • Ciresan, D. C., Meier, U., Masci, J., & Schmidhuber, J. (2011). Flexible, high-performance convolutional neural...
  • C. Clopath et al. (2010). Connectivity reflects coding: a model of voltage-based STDP with homeostasis. Nature Neuroscience.
  • C.L. Colby et al. (1999). Space and attention in parietal cortex. Annual Review of Neuroscience.
  • P. Dayan et al. (2001). Theoretical neuroscience: computational and mathematical modeling of neural systems.
  • C.A. Erickson et al. (1999). Responses of macaque perirhinal neurons during and after visual stimulus association learning. Journal of Neuroscience.
  • M. Fabre-Thorpe et al. (1998). Rapid categorization of natural images by rhesus monkeys. Neuroreport.
  • R.V. Florian (2007). Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity. Neural Computation.
  • K. Fukushima (1980). Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics.
  • S. Fusi et al. (2000). Spike-driven synaptic plasticity: theory, simulation, VLSI implementation. Neural Computation.
  • S. Fusi et al. (2006). Eluding oblivion with smart stochastic selection of synaptic updates. Chaos.
  • M. Graupner et al. (2012). Calcium-based plasticity model explains sensitivity of synaptic changes to spike pattern, rate, and dendritic location. Proceedings of the National Academy of Sciences of the United States of America.
  • S. Grossberg (1980). How does a brain build a cognitive code? Psychological Review.
  • H.R. Heekeren et al. (2004). A general mechanism for perceptual decision-making in the human brain. Nature.
  • H.R. Heekeren et al. (2008). The neural systems that mediate human perceptual decision making. Nature Reviews Neuroscience.
  • G.D. Horwitz et al. (1999). Separate signals for target selection and movement specification in the superior colliculus. Science.
  • D.H. Hubel et al. (1965). Receptive fields and functional architecture in two nonstriate visual areas (18 and 19) of the cat. Journal of Neurophysiology.
  • R. Huerta et al. (2009). Fast and robust learning by reinforcement signals: explorations in the insect brain. Neural Computation.
  • Indiveri, G., & Fusi, S. (2007). Spike-based learning in VLSI networks of spiking neurons. In IEEE international...
  • E.M. Izhikevich (2003). Simple model of spiking neurons. IEEE Transactions on Neural Networks.
  • E.M. Izhikevich (2004). Which model to use for cortical spiking neurons? IEEE Transactions on Neural Networks.