# Event-driven implementation of Deep Spiking Convolutional Neural Networks for Supervised Classification using the SpiNNaker neuromorphic platform

Alberto Patino-Saucedo<sup>a,\*</sup>, Horacio Rostro-Gonzalez<sup>a</sup>, Teresa Serrano-Gotarredona<sup>b</sup>, Bernabé Linares-Barranco<sup>b</sup>

<sup>a</sup>Department of Electronics Engineering, University of Guanajuato, Salamanca, Mexico <sup>b</sup>Instituto de Microelectrónica de Sevilla (IMSE-CNM), CSIC, Seville, Spain

<sup>\*</sup>Corresponding author

*Email addresses:* alberto.patino@ugto.mx (Alberto Patino-Saucedo), hrostrog@ugto.mx (Horacio Rostro-Gonzalez), terese@imse-cnm.csic.es (Teresa Serrano-Gotarredona), bernabe@imse-cnm.csic.es (Bernabé Linares-Barranco)

Preprint submitted to Neural Networks

### Abstract

Neural networks have enabled great advances in recent times, due mainly to improved parallel computing capabilities in accordance with Moore's Law, which reduced the time needed for the parameter learning of complex, multi-layered neural architectures. However, with silicon technology reaching its physical limits, new types of computing paradigms are needed to increase the power efficiency of learning algorithms, especially for dealing with deep spatio-temporal knowledge on embedded applications. With the goal of mimicking the brain's power efficiency, new hardware architectures such as the SpiNNaker board have been built. Furthermore, recent works have shown that networks using spiking neurons as learning units can match classical neural networks in supervised tasks. In this paper, we show that the implementation of state-of-the-art models on both the MNIST and the event-based NMNIST digit recognition datasets is possible on neuromorphic hardware. We use two approaches: directly converting a classical neural network to its spiking version, and training a spiking network from scratch. For both cases, software simulations and implementations on a SpiNNaker 103 machine were performed. Numerical results approaching the state of the art on digit recognition are presented, and a new method to decrease the spike rate needed for the task is proposed, which allows a significant reduction of the spikes (up to 34 times for a fully connected architecture) while preserving the accuracy of the system. With this method, we provide new insights on the capabilities offered by networks of spiking neurons to efficiently encode spatio-temporal information.

#### Keywords:

Neuromorphic Hardware, Artificial Neural Networks, Spiking Neural Networks, MNIST, SpiNNaker, Event Processing

# 1. Introduction

In the last few years, progress in the field of Artificial Neural Networks (ANNs) has led it to take a central role in solving Artificial Intelligence problems, outperforming other machine learning approaches such as kernel machines in highly complex tasks such as computer vision, speech recognition and natural language processing [1]. Even though ANNs have been studied for decades, their widespread use and development was restricted by their high computational cost. Their recent success came in step with the sustained exponential growth in computing capacity predicted by Moore [2]. This allowed the cornerstone learning algorithm of ANNs, back-propagation [3], to solve the weight assignment problem across multiple computational stages, or layers, giving rise to Deep Learning. The success of Deep ANNs lies in their ability to discover increasingly optimal representations of data, encoded in hierarchical structures, with no need for humans to specify all the knowledge needed by the system [4].

Many commercial, medical and scientific applications of deep ANNs can be found nowadays. One clear example is the face, fingerprint and voice recognition performed by smart-phones through the inference of deep ANN models, generally trained on external servers. However, with the upcoming end of Moore's Law, and to push the capabilities of deep learning forward, more power-efficient ways to train and deploy deep ANNs need to be achieved. Currently, training relies on massively parallel computing. The number of connections, power consumption and the required memory and computation time have limited the use of resource-intensive deep learning algorithms directly in embedded systems [5]. With this in mind, new ways to learn with more efficiency are being discussed and new paradigms of computation such as quantum and neuromorphic computing have emerged.

Neuromorphic computing is a novel technology that seeks to emulate the wiring and processes of the human brain in hardware. It has been boosted by two major projects: the Human Brain Project (Europe) [6], which uses the SpiNNaker [7] and the BrainScaleS [24] neuromorphic chips, and the BRAIN initiative (USA) [8]. Likewise, major hardware companies conduct active research on neuromorphic chips, with IBM (TrueNorth) [9] and Intel (Loihi) [10] standing out. One trend is to emulate cortical cell structures as accurately as possible and expect emergent properties of intelligence to arise. Another, more straightforward approach is to seek a convergence between neuromorphic and deep learning technologies. Both views agree on the use of Spiking Neurons, models for neural simulations that capture a fundamental property of biological neurons missing in ANNs: the use of spikes, or binary events, which enable an efficient way of modeling spatio-temporal data [11].

Spiking Neural Networks (SNNs) are being studied in the hope of obtaining energy-efficient representations of the world, inspired by the brain's high memory capacity, noise robustness, and ability to handle complex tasks at low power consumption. Methods for training Deep Spiking Neural Networks (DSNNs) have appeared as a natural bridge between neuromorphic computing and deep learning, and several algorithms have been proposed for implementing spiking versions of Fully Connected and Convolutional Neural Networks (CNNs) [12] [13], Restricted Boltzmann Machines [14], Deep Belief Networks [15] [16] and Recurrent Neural Networks [17]. Backpropagation, the most widely used learning algorithm of Deep Learning, has been successfully applied to train spiking CNNs through approximations such as SpikeProp [18], ReSuMe [19], SLAYER [20] and Spatio-Temporal Backpropagation (STBP) [21]. Furthermore, methods for direct conversion from pretrained non-spiking CNNs to SNNs have been proposed [22] [13], with results matching the state of the art in supervised classification on benchmark datasets.

Most authors of the aforementioned frameworks for training or converting DSNNs hinted at the convenience and feasibility of deploying their networks on neuromorphic hardware. Stromatias [16], Cao [12], Rueckauer [13] and Wu [21] left the deployment of their proposed frameworks as future work. Shrestha [20] pointed out the difficulty of performing the training phase of SNNs on current neuromorphic chips, which leaves room only for performing inference with their method. Cao advocates for demonstrating the power efficiency of neuromorphic implementations of DSNNs. The implicit consensus is that the current state of development of neuromorphic chips would only allow the implementation of DSNN systems with two separate stages: one for offline training/conversion and another for online neuromorphic inference.

Among the works proposing implementations of DSNNs on neuromorphic chips, Esser [23] implemented a sparsely connected neural network on the TrueNorth chip, achieving a maximum classification accuracy of 99.42% on the MNIST dataset, the best reported score on this or any other neuromorphic platform. This work also reported a tradeoff between energy efficiency and classification accuracy. Schmitt [24] implemented a DSNN on the BrainScaleS system, reaching a maximum accuracy of 95% on MNIST. Regarding the SpiNNaker platform, in an early attempt, Jin [25] deployed a non-spiking Multilayer Perceptron Network, without testing on benchmark datasets. Serrano-Gotarredona [26] implemented a CNN for symbol recognition, with events as inputs, achieving 80% accuracy. Stromatias [16] deployed a spiking Deep Belief Network, reaching 95% on the MNIST dataset, and Liu [27] deployed an energy-efficient non-spiking Deep Neural Network with online training, achieving 96% on MNIST.

In this work, we show a SpiNNaker implementation of the popular LeNet architecture, including approximated pooling layers and ReLU activations, obtained by the direct conversion method suggested in [13]. This network reaches 98.20% on the MNIST dataset, beating the best reported accuracy on the SpiNNaker platform. Additionally, we show the first neuromorphic implementation of an event-based digit classifier, by deploying a network trained with the STBP algorithm for the N-MNIST dataset, reaching 97.92% accuracy. Both networks were simulated using PyNN, a neural simulation platform, and a comparison between ANN implementation, SNN simulation and final hardware deployment is provided. Finally, we propose a modification of the cost function of the STBP algorithm in order to reduce the average spike rate necessary for classification, achieving a 19-fold reduction on the LeNet architecture and a 34-fold reduction for a densely connected SNN, with drops in classification accuracy of less than 2% and 1%, respectively. Reducing the number of spikes needed to perform the classification task is important for achieving more energy-efficient inference, as shown by a further experiment on the SpiNNaker platform, where the inference time per input sample is reduced by up to 6%.

Furthermore, this work presents the first implementation on neuromorphic hardware of an SNN trained with the widely used PyTorch [31] framework, with a performance comparison between software and hardware implementations, broadening the scope of the deep SNN architectures that can be tested on the SpiNNaker platform.

# 2. Materials and Methods

## 2.1. SNN model and simulation

The spiking neuron model used in this work is an instance of the commonly used Leaky Integrate-and-Fire (LIF), suitable for very efficient implementations. The dynamics of the membrane potential u(t) of a single neuron is given by:

$$\frac{du(t)}{dt} = \frac{u_{rest} - u(t)}{\tau_m} + \frac{I(t)}{c_m} \tag{1}$$

where  $u_{rest}$  is the resting potential,  $\tau_m$  is the membrane's time constant,  $c_m$  is the membrane capacitance and I(t) is the neuron's input current.

When a small input current pulse of duration $\Delta t$ is injected starting at t=0, with the initial membrane potential at a rest state equal to zero, $u(0) = u_{rest} = 0$, the neuron's membrane is 'charged' during the stimulus and 'discharged' when it ends. For discrete simulation purposes, this $\Delta t$ is taken as the sampling period of the discrete time. The response of the neuron to a small pulse or spike at time step k corresponds to the discharge of the neuron: with $u_{rest} = 0$ and no input current, equation 1 reduces to $du/dt = -u/\tau_m$, whose solution over one sampling period yields the discrete update equation of the membrane potential:

$$u[k+1] = u[k]e^{-\frac{\Delta t}{\tau_m}} \tag{2}$$

In the case of multiple pre-synaptic connections to the neuron, its input current is computed as the cumulative effect of pre-synaptic spikes:

$$I[k] = \sum_{j=1}^{M} w_j \theta_j[k] + I_{bias} \tag{3}$$

where M is the number of pre-synaptic neurons (with membrane potentials $u_j$), $w_j$ is the synaptic strength from the *j*-th pre-synaptic neuron (positive if the synapse is excitatory and negative if inhibitory), $I_{bias}$ is an offset current and $\theta_j[k]$ denotes the occurrence of a spike on the *j*-th pre-synaptic neuron in the current time step k. Each neuron fires whenever its membrane potential reaches or surpasses a threshold $u_{th}$. The spike is processed by all post-synaptic neurons in the next time step, and the membrane potential of the firing neuron is reset to $u_{reset}$. The spike function is thus given by:

$$\theta_j[k] = \begin{cases} 1 & u_j[k-1] \ge u_{th} \\ 0 & \text{otherwise} \end{cases} \tag{4}$$
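As an illustration of equations 1 to 4, the following minimal sketch simulates a single LIF neuron in plain Python. It is not the simulator code used in this work: the function name `simulate_lif`, the default parameter values and the forward-Euler term `i_k * dt / c_m` used to inject the input current are assumptions made for the example. Only the exponential leak is taken from equation 2. Units are assumed to be mV, ms, nF and nA.

```python
import math

def simulate_lif(currents, tau_m=10.0, c_m=1.0, u_th=1.0, u_reset=0.0, dt=1.0):
    """Simulate one LIF neuron; returns (potentials, spikes) per time step."""
    decay = math.exp(-dt / tau_m)  # leak factor of equation 2
    u = u_reset
    potentials, spikes = [], []
    for i_k in currents:
        # Leak (equation 2) plus the charge injected during this step
        # (assumed forward-Euler discretisation of the I(t)/c_m term).
        u = u * decay + i_k * dt / c_m
        fired = u >= u_th
        potentials.append(u)
        spikes.append(1 if fired else 0)
        if fired:
            u = u_reset  # reset after firing
    return potentials, spikes

# Constant input current of 0.6 nA for four time steps.
potentials, spikes = simulate_lif([0.6, 0.6, 0.6, 0.6])
```

Here the spike is recorded in the step where the threshold is crossed; following equation 4, post-synaptic neurons would see it one time step later.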

One important metric of a spiking neuron's state used in this work is its firing rate. By definition, the firing rate r[k] of a given neuron whose simulation started at time k = 0 is:

$$r[k] = \frac{1}{k} \sum_{t=1}^{k} \theta[t] \tag{5}$$
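The running firing rate can be computed directly from a recorded spike train. The sketch below is illustrative only: the helper name `firing_rate` and the conversion to Hz through the time step `dt_ms` are assumptions, since equation 5 itself gives the rate in spikes per time step.

```python
def firing_rate(spikes, dt_ms=1.0):
    """Return the running firing rate (in Hz) after each time step."""
    rates, count = [], 0
    for k, s in enumerate(spikes, start=1):
        count += s                                  # spikes emitted so far
        rates.append(1000.0 * count / (k * dt_ms))  # spikes per second
    return rates

rates = firing_rate([0, 1, 0, 1])
```

With a 1 ms time step, a neuron that fires in every step reaches 1000 Hz, which is the upper bound on the rate that this discretisation can represent.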

For running experiments with the spiking neural model used in this work, we used PyNN [28], a high-level spiking neuron interface that supports experiments across multiple simulators (e.g., BRIAN, NEST, NEURON), making scripts highly portable. Most importantly, PyNN can be used as an interface between high-level modeling and hardware implementation on the SpiNNaker platform. More details on the PC-based and SpiNNaker-based PyNN implementations are given in subsection 2.5.



Figure 1: Scheme of the different stages of the neuromorphic digit classification systems proposed in this work. The upper part depicts the conversion system and the lower part the STBP-trained system.

## 2.2. ANN to SNN conversion

Given a fully connected ANN with L layers, each with $M^l$ neurons or units, let $W^l$, $l \in \{1, ..., L\}$, denote the weight matrix connecting the units of layer l-1 to those of layer l. The ReLU activation $a_i^l$ of unit i in layer l, with bias $b_i^l$, is given by:

$$a_i^l = \max\left\{0, z_i^l\right\} \tag{6}$$

$$z_i^l = \sum_{j=1}^{M^{l-1}} W_{ij}^l a_j^{l-1} + b_i^l \tag{7}$$

The main objective of the ANN to SNN conversion is to take a pre-trained ANN and create an analogous SNN with the same connectivity (i.e., a one-to-one correspondence between ANN and SNN units) where the firing rate $r_i^l$ of every spiking neuron is proportional to the value of the activation $a_i^l$ of the corresponding artificial neuron. This exploits the fact that ReLU activations in ANNs, like firing rates in SNNs, are never negative. The conversion is performed by computing a scaled version of $W^l$ and taking it as the synaptic weight matrix of the corresponding layer of the SNN. The scaling is needed to ensure that, for every layer of the ANN, two conditions are satisfied: 1) if $z_i^l < 0$, the effect of pre-synaptic spikes on the membrane potential of neuron *i* does not make it fire, and 2) $\max\{a_{1}^{l}, ..., a_{M^{l}}^{l}\}$ does not surpass the maximum firing rate of the spiking neuron simulation, set to 1 kHz.
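One common way to obtain such a scaling is to normalize each layer by the largest activation observed on a set of sample inputs. The sketch below illustrates that idea in plain Python and is not necessarily the scaling applied by the toolbox used in this work: the function names, the use of the plain maximum as the scale factor, and the assumption that inputs lie in [0, 1] are all choices made for the example.

```python
def relu_layer(weights, biases, inputs):
    """Equations 6 and 7: a = max(0, W a_prev + b) for one layer."""
    return [max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def normalize_layers(layers, samples):
    """Rescale (W, b) pairs so that no activation exceeds 1 on `samples`."""
    normalized = []
    prev_scale = 1.0  # inputs are assumed to be already in [0, 1]
    activations = [list(s) for s in samples]
    for weights, biases in layers:
        # Activations of the original (unscaled) network for this layer.
        activations = [relu_layer(weights, biases, a) for a in activations]
        scale = max(max(a) for a in activations) or 1.0
        normalized.append((
            [[w * prev_scale / scale for w in row] for row in weights],
            [b / scale for b in biases],
        ))
        prev_scale = scale
    return normalized

# Hypothetical two-layer network and two sample inputs.
layers = [([[2.0, 0.0], [0.0, 4.0]], [0.0, 0.0]), ([[1.0, 1.0]], [1.0])]
samples = [[1.0, 0.5], [0.5, 1.0]]
scaled = normalize_layers(layers, samples)
```

After this rescaling, every activation of the scaled network equals the original activation divided by its layer's scale factor, so the ordering of the outputs is preserved while each value stays within the representable rate range.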

In [29], the authors proposed a method to convert ANNs into SNNs for image classification, and released a toolbox supporting the conversion from ANN models defined in different platforms such as Keras, Lasagne and Caffe, returning the synaptic weights to be used on SNN simulators such as PyNN. It supports a number of commonly used features of ANNs, such as Convolutional and Batch-Normalization layers, and ReLU, Softmax and binary activations, among others. A summary of the main features of the conversion toolbox is provided by the authors on the website <sup>1</sup> and presented in Table 1.

This toolbox has been used here to convert a modified LeNet CNN for digit recognition into its corresponding SNN, whose performance is evaluated in PC simulation and in hardware by running the equivalent PyNN code on the SpiNNaker platform. Numerical results are presented in section 3.

<sup>1</sup>http://snntoolbox.readthedocs.io/en/latest/guide/intro.html

Figure 2: Responses to pre-synaptic spikes for both kinds of neurons considered in this work. Dashed horizontal lines represent the thresholds. Note that both neurons emit a spike between 6 and 7 ms.

| Supported features                          | Input: Keras [K], Lasagne [L], Caffe [C] | Output: Brian2 [B], pyNN [P], MegaSim [M], INIsim [I] |
|---------------------------------------------|------------------------------------------|-------------------------------------------------------|
| Fully connected                             | All                                      | All                                                   |
| Convolutional                               | All                                      | All                                                   |
| Max-Pooling                                 | All                                      | I                                                     |
| Average-Pooling                             | All                                      | All                                                   |
| Batch-Normalization                         | All                                      | All                                                   |
| Dropout                                     | All                                      | All                                                   |
| Flatten                                     | All                                      | All                                                   |
| Merge/Concatenate (Inception modules)       | K, L                                     | I                                                     |
| Linear activation                           | All                                      | Replaced by ReLU                                      |
| ReLU activation                             | All                                      | All                                                   |
| Softmax activation                          | All                                      | I                                                     |
| Binary activation $\{-1, 1\}$ or $\{0, 1\}$ | L                                        | I                                                     |
| Binary weights $\{-1, 1\}$                  | L                                        | All                                                   |
| Non-zero biases                             | All                                      | I                                                     |

Table 1: Summary of the main features of the toolbox.

The MNIST dataset [30] for digit recognition, consisting of 60000 training and 10000 test images of handwritten digits, was used to test the ANN to SNN conversion approach. This dataset is a standard benchmark to test the performance of machine learning algorithms, and it has also been used for classification tasks on neuromorphic chips. See table 4 for a survey of the neuromorphic chip implementations of SNNs for the MNIST dataset.

In this work, the ANN to SNN conversion process was made by taking a pre-trained LeNet CNN for MNIST and converting it to a Convolutional SNN. The architecture of the LeNet network consists of one input layer, three convolutional layers and two fully connected layers (including the output layer), with pooling operations in between. A scheme of the architecture is shown in figure 3. The converted equivalent SNN network was simulated in PyNN and implemented on the SpiNNaker chip following the process shown in figure 1.



Figure 3: LeNet CNN Architecture used for direct conversion into Deep Convolutional SNN.

## 2.3. SNN Training with Spatio Temporal Back Propagation

Conversion methods such as the one discussed above restrict the SNN model to spatial-domain information. The spiking neuron parameters used in the simulator for the standard conversion method proposed in [29] neglect the temporal or "memory" effect of the membrane by setting a high value for its time constant $\tau_m$ (see table 2). In contrast, the method proposed in [21], called Spatio Temporal Back Propagation (STBP), allows a more complete treatment of the temporal domain by training the SNN with a time-dependent generalization of the ANN's backpropagation algorithm. Here, as in the conversion method, the neuron's activity is determined by its firing rate. The algorithm uses a loss function $\ell$ across S training samples and a time window T:

$$\ell = \frac{1}{S} \sum_{s=1}^{S} \left\| \boldsymbol{y}_{s} - \frac{1}{T} \sum_{t=1}^{T} \boldsymbol{\theta}_{s,L} \right\|_{2}^{2} \tag{8}$$

where $\boldsymbol{y}_s$ and $\boldsymbol{\theta}_{s,L}$ are the label vector of the *s*-th training sample and its corresponding spike activity vector in the output layer (last layer *L*) after forward propagation, respectively.

| Parameter          | Conversion | STBP Training |
|--------------------|------------|---------------|
| $u_{reset}$ (mV)   | 0.0        | 0.0           |
| $u_{rest}$ (mV)    | 0.0        | 0.0           |
| $u_{th}$ (mV)      | 1.0        | 0.3           |
| $\tau_m$ (ms)      | 1000       | 0.8325        |
| $c_m$ (nF)         | 0.09       | 0.001         |
| $\Delta t$ (ms)    | 1.0        | 1.0           |

Table 2: Simulation parameters for the converted and STBP trained SNN model.

We used an implementation of this algorithm provided by the authors. The neural network model is described in PyTorch [31], an open source deep learning platform. After training any deep SNN model, the platform allows the extraction of the final weights and biases. These are used for reproducing the PyTorch results with PyNN and for the subsequent implementation on SpiNNaker, provided that the connectivity in PyNN (i.e., its populations and projections) is equivalent to the PyTorch tensor operations that make up the whole model. In this way, the extracted weights are used as synaptic weights in PyNN, and the extracted biases are used as offsets for the threshold $u_{th}$, with exactly the same values. A comparison of the neuron parameters for the PyNN implementations of both methods, conversion and training, is given in table 2. As seen in figure 2, the decay of the membrane potential in the model used by the STBP method is more realistic than that used by the conversion method.
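To make the weight transfer concrete, the sketch below turns a dense weight matrix, laid out as in a PyTorch `nn.Linear` layer (one row per post-synaptic neuron), into lists of `(pre, post, weight, delay)` tuples, the format accepted by PyNN's `FromListConnector`. This is an illustration rather than the code used in this work: the helper name `to_connection_lists`, the 1 ms delay, and the choice to split connections by sign into excitatory and inhibitory lists are assumptions. Whether the inhibitory list should carry negative weights or their magnitudes depends on the simulator backend and is not settled here.

```python
def to_connection_lists(weight_matrix, delay=1.0):
    """Split a dense weight matrix into excitatory/inhibitory connection lists.

    weight_matrix[i][j] is the weight from pre-synaptic neuron j to
    post-synaptic neuron i.
    """
    excitatory, inhibitory = [], []
    for post, row in enumerate(weight_matrix):
        for pre, w in enumerate(row):
            if w > 0:
                excitatory.append((pre, post, w, delay))
            elif w < 0:
                inhibitory.append((pre, post, w, delay))
            # Zero weights are dropped: they need no synapse at all.
    return excitatory, inhibitory

# Hypothetical 2x2 weight matrix extracted from a trained layer.
exc, inh = to_connection_lists([[0.5, -0.2], [0.0, 0.3]])
```

Each list would then feed one projection between the corresponding populations, one with an excitatory and one with an inhibitory receptor type.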

The N-MNIST dataset [32] is an event-based version of the MNIST dataset, where each sample was displayed on a monitor and recorded with a Dynamic Vision Sensor (DVS) mounted on a motorized pan-tilt unit performing a saccade movement. The sensor records a spike whenever a change of illumination is detected. The spatial dimension is the same as that of the MNIST dataset, 28x28 pixels. Following [21], for every sample we take the positive and negative changes of illumination as two different channels and feed them to a 400-400-10 fully connected SNN trained with the STBP algorithm. Afterward, the trained weights are used for a simulation of the SNN on PyNN and for the implementation on SpiNNaker. Results for both experiments are presented in section 3.
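The two-channel input described above can be sketched as a mapping from DVS events to per-neuron spike-time lists, which is the kind of input a spike-source population expects. The event tuple layout `(x, y, polarity, t_ms)`, the neuron indexing and the helper name `events_to_spike_times` are assumptions for this example and do not describe the actual N-MNIST file format.

```python
def events_to_spike_times(events, width=28, height=28):
    """Group DVS events into one spike-time list per input neuron.

    events: iterable of (x, y, polarity, t_ms), polarity 1 (ON) or 0 (OFF).
    Returns 2 * width * height lists: OFF channel first, then ON channel.
    """
    spike_times = [[] for _ in range(2 * width * height)]
    for x, y, polarity, t_ms in events:
        index = polarity * width * height + y * width + x
        spike_times[index].append(t_ms)
    for times in spike_times:
        times.sort()  # events may arrive out of order
    return spike_times

# Three hypothetical events on a 2x2 sensor.
events = [(0, 0, 0, 2.0), (1, 0, 1, 1.0), (0, 0, 0, 1.0)]
spike_times = events_to_spike_times(events, width=2, height=2)
```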

## 2.4. SNN Training with Spike Regularization

One of the premises that has boosted research on SNNs is the hope of making computations more energy-efficient when implemented on event-driven neuromorphic hardware, in comparison with their frame-based counterparts. This would be possible due to the characteristics of neuromorphic devices, which allow the state of every neuron to be kept and updated independently, without the need for a general clock, i.e., computing spikes asynchronously. It is known that the conditional multiply-accumulate operation in each synapse is the driver of neural computations in neuromorphic hardware [33]. This means that the spike rates and the number of active synapses can be used to estimate the energy consumption of such devices. Taking this into account, Cao [12] conducted an analysis of power consumption for their spiking CNN module, assuming a direct relation between the spike count and the consumed power. In this work, we adopt this approach and propose a modification of the cost function of the STBP algorithm to decrease the number of spikes. This modification acts as a spike-activity regularization, analogous to the weight regularization that is commonly used when training classical neural networks. With the goal of achieving not only better generalization in the classification task but also reduced spike activity, we introduce the new loss function:

$$\ell = \frac{1}{S} \sum_{s=1}^{S} \left( \left\| \boldsymbol{y}_{s} - \frac{1}{T} \sum_{t=1}^{T} \boldsymbol{\theta}_{s,L} \right\|_{2}^{2} + \frac{\lambda_{sr}}{NT} \sum_{t=1}^{T} \sum_{l=1}^{L-1} \boldsymbol{\theta}_{s,l} \right) \tag{9}$$

This cost function adds, for every sample, the number of spikes elicited within the time window T by all the neurons except those in the output layer. The scaling factor N, equal to the total number of hidden units, ensures that both terms in the equation are on the same scale. The spike regularization factor $\lambda_{sr} \in [0, 1]$ controls the strength of the penalty and can be interpreted as a compromise between the network's accuracy and spike economy. We conduct experiments with six different values of $\lambda_{sr}$ on both the fully connected and the convolutional network and report the results in section 3.2.
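The value of the regularized loss in equation 9 can be evaluated from recorded spikes as in the sketch below. It only computes the loss, without gradients, and is not the PyTorch training code used in this work. The function name, the nested-list data layout, and the reading of the regularization term as the total hidden spike count (summed over neurons as well as over layers and time steps) are assumptions. Setting `lam_sr=0` recovers the loss of equation 8.

```python
def spike_regularized_loss(labels, output_spikes, hidden_spikes, lam_sr):
    """Evaluate equation 9 over S samples.

    labels[s]        : one-hot label vector y_s
    output_spikes[s] : T lists, the output-layer spike vector at each step
    hidden_spikes[s] : T lists, the spikes of all N hidden neurons at each step
    """
    total = 0.0
    for y, out, hid in zip(labels, output_spikes, hidden_spikes):
        T = len(out)
        n_hidden = len(hid[0])
        # Mean output activity over the time window (rate code).
        rates = [sum(step[i] for step in out) / T for i in range(len(y))]
        error = sum((yi - ri) ** 2 for yi, ri in zip(y, rates))
        # Total number of hidden spikes emitted during the window.
        spike_count = sum(sum(step) for step in hid)
        total += error + lam_sr * spike_count / (n_hidden * T)
    return total / len(labels)

# One hypothetical sample: 2 classes, 2 hidden neurons, T = 2 steps.
labels = [[1, 0]]
output_spikes = [[[1, 0], [1, 0]]]
hidden_spikes = [[[1, 1], [0, 0]]]
loss = spike_regularized_loss(labels, output_spikes, hidden_spikes, lam_sr=1.0)
```

In this toy case the classification error is zero, so the whole loss comes from the spike penalty: two hidden spikes out of a possible four.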

## 2.5. Spiking Neural Network Architecture (SpiNNaker)

SpiNNaker is a massively parallel multicore computing system designed for modeling very large spiking neural networks in real time. Both the system architecture and the design of the SpiNNaker chip were developed by the Advanced Processor Technologies Research Group (APT) at the University of Manchester. Each SpiNNaker chip consists of 18 fully programmable ARM cores.

In this work, a SpiNNaker 103 machine (figure 4) was used. This board comprises 48 SpiNNaker chips, totaling 864 ARM processor cores deployed as 48 monitor processors, 768 application cores and 48 spare cores. Each application core has two types of RAM: a 32kB ITCM (instruction tightly coupled memory) for storing instructions and a 64kB DTCM (data tightly coupled memory) for storing neuron states and parameters. Additionally, each SpiNNaker chip contains a 128 MB SDRAM shared by the 18 cores for storing the synaptic weights. The communication between cores is done through a multicast packet-routing mechanism that mimics the high connectivity found in biological brains [34]. A 100Mbps Ethernet connection is used for controlling an I/O interface between the computer and the SpiNNaker board. The neurons and synapses are modeled with sPyNNaker [35], a software package for simulating PyNN-defined spiking neural networks on the SpiNNaker platform. Two SpiNNaker implementations were performed for each proposed approach, as shown in the diagram of figure 1.
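The resource figures above can be cross-checked with a few lines of arithmetic. The split of one monitor core and one spare core per chip follows from the totals quoted in the text; the figure of 255 neurons per core is the recommended maximum mentioned in the Discussion, and the variable names are chosen for this example only.

```python
CHIPS = 48
CORES_PER_CHIP = 18
MONITOR_PER_CHIP = 1   # 48 monitor processors in total
SPARE_PER_CHIP = 1     # 48 spare cores in total
NEURONS_PER_CORE = 255 # recommended maximum per application core
SDRAM_PER_CHIP_MB = 128

total_cores = CHIPS * CORES_PER_CHIP
application_cores = CHIPS * (CORES_PER_CHIP - MONITOR_PER_CHIP - SPARE_PER_CHIP)
max_neurons = application_cores * NEURONS_PER_CORE
total_sdram_mb = CHIPS * SDRAM_PER_CHIP_MB

print(total_cores, application_cores, max_neurons, total_sdram_mb)
```

The resulting upper bound of 195840 neurons matches the figure of about 200 thousand neurons quoted for this board.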

# 3. Results

## 3.1. SpiNNaker implementation

We have implemented two Deep SNNs on the SpiNNaker neuromorphic platform for a handwritten digit classification task. The first was trained in Keras as a classical CNN (LeNet) with a static dataset (MNIST) and then converted into a Deep SNN with the snntoolbox. The second was trained as a Deep SNN in PyTorch using the novel STBP algorithm. The input of the second network is an event-based equivalent of the MNIST dataset, recorded with a DVS camera. To measure the performance of the implementation, the whole test set (10000 samples) was propagated in both software and hardware, and the classification accuracy was measured as the percentage of correctly classified digits. In the case of the SpiNNaker implementation, 15 ms of activity were recorded. An additional neural simulation in PyNN (NEST) was performed for both SNNs, using 1000 samples. The real-time implementation takes approximately 0.4 seconds per sample on the neuromorphic hardware and 10 seconds per sample in the neural simulation software.

Figure 4: SpiNNaker 103 machine.

An example of the activity during inference of the STBP trained network, both in software (PyTorch, PyNN) and hardware (SpiNNaker) is shown in figure 5. The figure shows the spike times of all neurons in the network for ten presented samples. An image is considered to have been classified correctly if the neuron associated with the input digit displays the highest activity of all the output layer neurons.
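The readout rule described above amounts to taking the output neuron with the most spikes. A minimal sketch follows, with hypothetical function names and a nested-list spike format chosen for the example; ties are resolved in favour of the lowest neuron index, which is an arbitrary choice.

```python
def classify(output_spikes):
    """Return the index of the output neuron with the most spikes.

    output_spikes: list over time steps of 0/1 lists, one entry per neuron.
    """
    counts = [sum(column) for column in zip(*output_spikes)]
    return max(range(len(counts)), key=counts.__getitem__)

def accuracy(predictions, labels):
    """Percentage of correctly classified samples."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return 100.0 * correct / len(labels)

# Three time steps, three output neurons: neuron 1 fires most often.
predicted = classify([[0, 1, 0], [0, 1, 1], [1, 1, 0]])
```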

Numerical results are shown in table 3. A comparison with other MNIST implementations on neuromorphic hardware (table 4) shows that ours achieves the second-best overall result, with 98.2% correct classification, and the best on the SpiNNaker platform. Furthermore, to the best of our knowledge, this work presents the first neuromorphic hardware implementation of the event-based NMNIST benchmark, with 97.92% correct classification.

|               | MNIST/Conversion | NMNIST/STBP |
|---------------|------------------|-------------|
| Keras/PyTorch | 98.96            | 98.50       |
| PyNN (NEST)   | 97.98            | 97.04       |
| SpiNNaker     | 98.2             | 97.92       |

Table 3: Comparison of classification accuracy for the proposed methods and datasets.

| Model                  | Hardware     | Network Arch.  | S   | T   | Acc (%) |
|------------------------|--------------|----------------|-----|-----|---------|
| Merolla (2011) [33]    | Custom core  | Spiking RBM    | Yes | No  | 94.00   |
| Neil (2014) [36]       | FPGA         | Spiking DBN    | Yes | No  | 92.00   |
| Garbin (2014) [37]     | OxRAM device | DSNN (ConvNet) | Yes | No  | 94.00   |
| Stromatias (2015) [16] | SpiNNaker    | Spiking DBN    | Yes | No  | 95.00   |
| Esser (2015) [23]      | TrueNorth    | DSNN (Sparse)  | Yes | No  | 99.42   |
| Schmitt (2017) [24]    | BrainScaleS  | DSNN (Dense)   | Yes | Yes | 95.00   |
| Liu (2018) [27]        | SpiNNaker    | Deep Rewiring  | No  | Yes | 96.00   |
| Ours                   | SpiNNaker    | DSNN (ConvNet) | Yes | No  | 98.20   |

Table 4: Summary of spiking deep learning models implemented on neuromorphic hardware and their accuracy on MNIST. The column T is true if the hardware performs online training. The column S is true if the model uses spikes internally.

## 3.2. Spike Regularization

We report the effect of the proposed spike regularization of the STBP's cost function on the fully connected 400-400-10 (labeled Dense400) and the LeNet convolutional network for six different values of spike regularization  $(\lambda_{sr} \text{ in equation 9})$  ranging from  $\lambda_{sr} = 0$  (no regularization) to  $\lambda_{sr} = 1$ . After training both network architectures with PyTorch, the entire test set was propagated and measurements of the average spike rate per neuron were recorded. Additionally, the fully connected network was loaded into the SpiNNaker platform and measurements of the simulation time for 100 samples per experiment were performed. The simulation parameters of STBP training from table 2 remained unchanged. Numerical results are given in table 5. Graphical results are shown in figures 6 and 7. It is observed that as spike regularization increases, the average spike rate per neuron decreases following a logarithmic rule that gets close to rates observed in biological neurons as  $\lambda_{sr}$  approaches 1. The amount of elicited spikes is reduced almost 34 times for the fully connected and 19 times for the convolutional network, with a small drop in the classification accuracy: less than 1% and less than 2% for the Dense400 and LeNet respectively.
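The reduction factors and accuracy drops quoted above follow from the first and last rows of table 5, as the short check below shows. The dictionary layout and variable names are chosen for this example only; the numbers are copied from the table.

```python
# lambda_sr: (LeNet rate [Hz], Dense400 rate [Hz], LeNet acc [%], Dense400 acc [%])
table5 = {
    0.00: (64.88, 266.57, 98.4, 97.9),
    1.00: (3.35, 7.86, 96.7, 97.2),
}
base, reg = table5[0.00], table5[1.00]

lenet_reduction = base[0] / reg[0]  # about 19.4
dense_reduction = base[1] / reg[1]  # about 33.9
lenet_drop = base[2] - reg[2]       # about 1.7 percentage points
dense_drop = base[3] - reg[3]       # about 0.7 percentage points

print(round(lenet_reduction, 1), round(dense_reduction, 1))
```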



Figure 5: Raster plots for the Deep SNN simulation of ten NMNIST samples using PyTorch (top), PyNN (center) and SpiNNaker (bottom). Each sample was propagated for 16 ms, as indicated by the vertical lines. The corresponding labels are, from left to right: 7, 2, 1, 0, 4, 1, 4, 9, 5, 9.

## 4. Discussion

This paper constitutes an attempt to consistently port deep Spiking Neural Network simulations to neuromorphic hardware. Figure 5 shows that, in general, the desired spiking activity is preserved in our hardware

| $\lambda_{sr}$ | Spike rate [Hz], LeNet | Spike rate [Hz], Dense400 | Accuracy [%], LeNet | Accuracy [%], Dense400 |
|----------------|------------------------|---------------------------|---------------------|------------------------|
| 0.00           | 64.88                  | 266.57                    | 98.4                | 97.9                   |
| 0.01           | 32.70                  | 85.82                     | 98.4                | 97.8                   |
| 0.05           | 15.63                  | 41.89                     | 98.6                | 98.0                   |
| 0.10           | 10.86                  | 29.28                     | 98.7                | 97.4                   |
| 0.50           | 4.92                   | 12.80                     | 98.2                | 97.4                   |
| 1.00           | 3.35                   | 7.86                      | 96.7                | 97.2                   |

Table 5: Effect of spike regularization on the spike rate and accuracy for both densely connected and convolutional spiking neural networks.



Figure 6: Effect of spike regularization on both the densely connected and the convolutional neural networks for NMNIST digit recognition. Left: Effect on spike rate, with spike regularization on a log scale for better visualization. The green area indicates the spike rates commonly reported in biological neurons. Right: Effect on classification error and inference time (dotted line).

implementation even if multiple layers are used. Also, the availability of SNN versions of common Deep Learning techniques such as convolutional layers, average pooling, batch normalization and dropout, together with the aforementioned spiking reliability of the SpiNNaker implementation, opens the door to deploying complex deep spiking architectures for image and video classification, natural language processing, robot navigation, etc. Currently, the main limitation seems to be the number of simulated neurons, which on the SpiNNaker 103 board is about 200 thousand when using 255 neurons per core, the maximum recommended value. For



Figure 7: Raster plots for the Deep Fully Connected simulation of three NMNIST examples: 7,2,1; for three increasing values of  $\lambda_{sr}$ : 0, 0.1 and 1. Each sample was propagated for 16 ms, as indicated by the vertical lines. Top plot is for the first hidden layer. Bottom plot is for the output layer.

reference, the largest implementation reported here, the converted CNN, used only 8 thousand neurons.
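The capacity figure above can be reproduced with simple arithmetic. The sketch below assumes a board layout that is not stated in the text: 48 chips per SpiNNaker 103 board and 16 cores per chip available for neuron simulation. Only the 255 neurons per core and the 8 thousand neurons of the converted CNN come from the paper. The function names are illustrative.

```python
# Assumed board layout (not given in the text): 48 chips per board,
# 16 cores per chip available for running neuron models.
CHIPS_PER_BOARD = 48
APP_CORES_PER_CHIP = 16
NEURONS_PER_CORE = 255  # maximum recommended value quoted in the text


def board_capacity(chips=CHIPS_PER_BOARD, cores=APP_CORES_PER_CHIP,
                   neurons_per_core=NEURONS_PER_CORE):
    """Upper bound on the number of neurons one board can simulate."""
    return chips * cores * neurons_per_core


def cores_needed(n_neurons, neurons_per_core=NEURONS_PER_CORE):
    """Lower bound on the cores needed for n_neurons (ceiling division)."""
    return -(-n_neurons // neurons_per_core)


capacity = board_capacity()
cnn_neurons = 8000  # converted CNN, as reported in the text
print(f"board capacity: {capacity} neurons")
print(f"converted CNN: at least {cores_needed(cnn_neurons)} cores, "
      f"{100 * cnn_neurons / capacity:.1f}% of the board")
```

The core count is only a lower bound, since in practice each population is placed on its own cores and partially filled cores cannot be shared between layers.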

On the other hand, the use of neuromorphic hardware significantly reduces the simulation time of biologically realistic DSNNs. Here we show that



Figure 8: Weight distribution shift of four layers of the spiking CNN for two extreme values of  $\lambda_{sr}$ . The number of bins for the histogram visualization is fixed to 200.

training DSNN prototypes in, e.g., PyTorch and deploying them on SpiNNaker using the user-friendly PyNN module is possible in a few steps. This can greatly contribute to research on new training algorithms and architectures for efficient machine learning systems deployed on brain-like hardware.
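One of the steps in such a workflow is turning trained weights into explicit synaptic connections. The sketch below is an illustration under stated assumptions, not the code used in this work: it takes a dense weight matrix in the `(out_features, in_features)` layout of a PyTorch linear layer and produces two lists of `(pre, post, weight, delay)` tuples, which is the format accepted by PyNN's `FromListConnector`. The function name, the 1 ms default delay and the choice to store inhibitory weights as magnitudes are assumptions; some back ends expect signed weights instead.

```python
def to_connection_lists(weights, delay=1.0):
    """Split a dense weight matrix into excitatory and inhibitory lists.

    weights[post][pre] follows the (out_features, in_features) layout of a
    PyTorch linear layer. Each returned entry is (pre, post, weight, delay).
    Zero weights are dropped, since they carry no synapse.
    """
    excitatory, inhibitory = [], []
    for post, row in enumerate(weights):
        for pre, w in enumerate(row):
            if w > 0:
                excitatory.append((pre, post, w, delay))
            elif w < 0:
                # Assumption: the sign is conveyed by the receptor type of the
                # projection, so only the magnitude is stored here.
                inhibitory.append((pre, post, abs(w), delay))
    return excitatory, inhibitory


# Toy 2x2 layer: two presynaptic and two postsynaptic neurons.
exc, inh = to_connection_lists([[0.5, -0.25], [0.0, 1.5]])
print("excitatory:", exc)
print("inhibitory:", inh)
```

With a trained model, the nested list could be obtained from a layer with something like `layer.weight.detach().tolist()`, and each list would then feed a separate projection with the matching receptor type.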

It is important to point out that the ultimate goal of neuromorphic systems for machine learning is to achieve better energy efficiency than conventional hardware, rather than perfect accuracy. Following the approach of [12], we sought to reduce the number of spikes while preserving a high classification accuracy. The proposed modification of the cost function of the STBP algorithm achieves this goal and yields other interesting results worth further exploration:

- Spike rates similar to biological neurons are achievable with Deep SNNs.



Figure 9: Animated plots of the firing patterns in both architectures are provided. Spiking LeNet: https://photos.app.goo.gl/h2DytAutZ2pFK5rd8. Dense400: https://photos.app.goo.gl/uMq7dJbA7pmSwYyt9. Fully connected layers are arranged in 2D for better visualization.

Although the exact average spike rate of human brain neurons is still a matter of discussion, works like [38] and [39] allow an estimate between 0.1 and 10 Hz for hippocampal and cortical neurons. Our work shows that, with enough spike regularization, average spike rates in a DSNN can go below 10 Hz for a digit classification task, the lowest reported average being 3.35 Hz.

- Forcing the spike regularization factor to yield spike rates lower than 0.1 Hz leads to a significant drop in accuracy, especially for the spiking CNN. The minimum spike rate achievable for this task is left to future work.
- The best accuracy is achieved with a small spike regularization. As the right-hand plot in figure 6 shows, for both tested architectures the best accuracy was, surprisingly, obtained with a small regularization factor $\lambda_{sr}$, between 0.05 and 0.1, whereas a monotonic increase in the error was expected. One possible explanation for this behavior is that, in this range, both networks learn the best balance between generalization and expressive power (given by the spike rate).

- The spikes in Spiking CNNs are sparser than in their densely connected counterparts, hence the weaker spike regularization effect observed on the Spiking LeNet was expected. Nonetheless, the observed spike reduction in Spiking CNNs is not as uniform as in fully connected networks; instead, it is expressed in the fact that fewer feature maps are allowed to learn. Refer to the provided animations (links in the caption of figure 9) to observe this effect.
- In figure 8, the synaptic weight distribution for different layers of the Spiking LeNet is displayed, with and without regularization. As expected, regularization decreases the weights of excitatory (positive) synapses and increases those of inhibitory (negative) synapses. Interestingly, the effect is more visible in the first layer, which is responsible for processing lower-level, faster spatio-temporal features.
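Two of the observations listed above, the entry into the biological spike-rate range and the location of the best accuracy, can be read off table 5 programmatically. The sketch below uses the table values and the 0.1 to 10 Hz estimate quoted in the text; the helper names are illustrative only.

```python
BIO_BAND_HZ = (0.1, 10.0)  # estimate quoted in the text from [38] and [39]
LAMBDAS = [0.00, 0.01, 0.05, 0.10, 0.50, 1.00]

# Columns of table 5, in the same order as LAMBDAS.
RATES_HZ = {
    "LeNet": [64.88, 32.70, 15.63, 10.86, 4.92, 3.35],
    "Dense400": [266.57, 85.82, 41.89, 29.28, 12.80, 7.86],
}
ACCURACY = {
    "LeNet": [98.4, 98.4, 98.6, 98.7, 98.2, 96.7],
    "Dense400": [97.9, 97.8, 98.0, 97.4, 97.4, 97.2],
}


def smallest_lambda_in_band(rates, band=BIO_BAND_HZ):
    """Smallest tested lambda_sr whose average rate lies inside the band."""
    low, high = band
    for lam, rate in zip(LAMBDAS, rates):
        if low <= rate <= high:
            return lam
    return None  # no tested value reaches the band


def best_lambda(accuracies):
    """Tested lambda_sr with the highest accuracy (first one on ties)."""
    return LAMBDAS[accuracies.index(max(accuracies))]


for name in RATES_HZ:
    print(name,
          "enters the biological band at lambda_sr =",
          smallest_lambda_in_band(RATES_HZ[name]),
          "and is most accurate at lambda_sr =",
          best_lambda(ACCURACY[name]))
```

Since only six values of $\lambda_{sr}$ were tested, these are the tested settings closest to each transition rather than exact thresholds.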

One limiting factor of using the SpiNNaker platform is that it is difficult to assess its energy consumption. The reported time (seen in figure 6) can be taken as an indication that processing fewer spikes does indeed require less energy on this neuromorphic platform, but this time difference is mostly due to the memory footprint of recording the spikes and membrane potentials.

While an energy efficiency analysis was not in the scope of this paper, we believe that SNNs are naturally better suited for energy-efficient event-based processing than traditional ANNs. In this regard, we consider that the use in this work of a DVS-recorded dataset such as NMNIST and the introduction of spike regularization mechanisms in the training phase are steps in the right direction. Spatial event processing in neuromorphic hardware, such as that performed in this and other recent works, forms part of a promising alternative to represent and harness spatio-temporal data without the need for a global time axis. Traditional temporal processing relies on recording and processing snapshots of the data at a given rate. Instead, we advocate for the use of interconnected units that locally react to changes in the stimuli as they occur and are able to keep track of previous states. This way, knowledge is represented in the connectivity, internal states and firing activity of such deeply connected units, augmenting the memory capacity and energy efficiency of the systems, as background information is not represented. In particular, this work presents a way to extract knowledge from spatio-temporal data with SNNs by penalizing the excessive firing activity observed in previous

systems. In future work we aim to deploy more complex architectures and investigate how to efficiently perform online learning on neuromorphic hardware.

#### 5. Acknowledgements

This research has been supported by the CONACYT project FC2016-1961 'Neurociencia Computacional: de la teoría al desarrollo de sistemas neuromórficos'. This work was also partially supported by the EU H2020 grant 824164 "HERMES", and by the Spanish grant TEC2015-63884-C2-1-P "COGNET" (with support from the European Regional Development Fund).

#### References

- [1] J. Schmidhuber, Deep learning in neural networks: An overview, Neural Networks 61 (2015) 85–117.
- [2] G. E. Moore, Cramming more components onto integrated circuits, Proceedings of the IEEE 86 (1) (1998) 82–85.
- [3] D. E. Rumelhart, G. E. Hinton, R. J. Williams, Learning internal representations by error propagation, Tech. rep., California Univ San Diego La Jolla Inst for Cognitive Science (1985).
- [4] I. Goodfellow, Y. Bengio, A. Courville, Deep learning, MIT press, 2016.
- [5] Y. LeCun, Y. Bengio, G. Hinton, Deep learning, Nature 521 (7553) (2015) 436.
- [6] H. Markram, The human brain project, Scientific American 306 (6) (2012) 50–55.
- [7] S. B. Furber, F. Galluppi, S. Temple, L. A. Plana, The spinnaker project, Proceedings of the IEEE 102 (5) (2014) 652–665.
- [8] L. A. Jorgenson, W. T. Newsome, D. J. Anderson, C. I. Bargmann, E. N. Brown, K. Deisseroth, J. P. Donoghue, K. L. Hudson, G. S. Ling, P. R. MacLeish, et al., The brain initiative: developing technology to catalyse neuroscience discovery, Philosophical Transactions of the Royal Society B: Biological Sciences 370 (1668) (2015) 20140164.

- [9] F. Akopyan, J. Sawada, A. Cassidy, R. Alvarez-Icaza, J. Arthur, P. Merolla, N. Imam, Y. Nakamura, P. Datta, G.-J. Nam, et al., Truenorth: Design and tool flow of a 65 mw 1 million neuron programmable neurosynaptic chip, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 34 (10) (2015) 1537–1557.
- [10] M. Davies, N. Srinivasa, T.-H. Lin, G. Chinya, Y. Cao, S. H. Choday, G. Dimou, P. Joshi, N. Imam, S. Jain, et al., Loihi: a neuromorphic manycore processor with on-chip learning, IEEE Micro 38 (1) (2018) 82–99.
- [11] N. K. Kasabov, Time-Space, Spiking Neural Networks and Brain-Inspired Artificial Intelligence, Vol. 7, Springer.
- [12] Y. Cao, Y. Chen, D. Khosla, Spiking deep convolutional neural networks for energy-efficient object recognition, International Journal of Computer Vision 113 (1) (2015) 54–66.
- [13] B. Rueckauer, I.-A. Lungu, Y. Hu, M. Pfeiffer, Theory and tools for the conversion of analog to spiking convolutional neural networks, arXiv preprint arXiv:1612.04052.
- [14] E. Neftci, S. Das, B. Pedroni, K. Kreutz-Delgado, G. Cauwenberghs, Event-driven contrastive divergence for spiking neuromorphic systems, Frontiers in neuroscience 7 (2014) 272.
- [15] P. O'Connor, D. Neil, S.-C. Liu, T. Delbruck, M. Pfeiffer, Real-time classification and sensor fusion with a spiking deep belief network, Frontiers in neuroscience 7 (2013) 178.
- [16] E. Stromatias, D. Neil, M. Pfeiffer, F. Galluppi, S. B. Furber, S.-C. Liu, Robustness of spiking deep belief networks to noise and reduced bit precision of neuro-inspired hardware platforms, Frontiers in neuroscience 9 (2015) 222.
- [17] A. Shrestha, K. Ahmed, Y. Wang, D. P. Widemann, A. T. Moody, B. C. Van Essen, Q. Qiu, A spike-based long short-term memory on a neurosynaptic processor, in: 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), IEEE, 2017, pp. 631–637.

- [18] S. M. Bohte, J. N. Kok, H. La Poutre, Error-backpropagation in temporally encoded networks of spiking neurons, Neurocomputing 48 (1-4) (2002) 17–37.
- [19] F. Ponulak, A. Kasiński, Supervised learning in spiking neural networks with resume: sequence learning, classification, and spike shifting, Neural computation 22 (2) (2010) 467–510.
- [20] S. B. Shrestha, G. Orchard, Slayer: Spike layer error reassignment in time, in: Advances in Neural Information Processing Systems, 2018, pp. 1419–1428.
- [21] Y. Wu, L. Deng, G. Li, J. Zhu, L. Shi, Spatio-temporal backpropagation for training high-performance spiking neural networks, Frontiers in neuroscience 12.
- [22] P. U. Diehl, D. Neil, J. Binas, M. Cook, S.-C. Liu, M. Pfeiffer, Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing, in: 2015 International Joint Conference on Neural Networks (IJCNN), IEEE, 2015, pp. 1–8.
- [23] S. K. Esser, R. Appuswamy, P. Merolla, J. V. Arthur, D. S. Modha, Backpropagation for energy-efficient neuromorphic computing, in: Advances in Neural Information Processing Systems, 2015, pp. 1117–1125.
- [24] S. Schmitt, J. Klähn, G. Bellec, A. Grübl, M. Guettler, A. Hartel, S. Hartmann, D. Husmann, K. Husmann, S. Jeltsch, et al., Neuromorphic hardware in the loop: Training a deep spiking network on the brainscales wafer-scale system, in: 2017 International Joint Conference on Neural Networks (IJCNN), IEEE, 2017, pp. 2227–2234.
- [25] X. Jin, M. Luján, M. M. Khan, L. A. Plana, A. D. Rast, S. R. Welbourne, S. B. Furber, Algorithm for mapping multilayer bp networks onto the spinnaker neuromorphic hardware, in: 2010 Ninth International Symposium on Parallel and Distributed Computing, IEEE, 2010, pp. 9–16.
- [26] T. Serrano-Gotarredona, B. Linares-Barranco, F. Galluppi, L. Plana, S. Furber, Convnets experiments on spinnaker, in: 2015 IEEE International Symposium on Circuits and Systems (ISCAS), IEEE, 2015, pp. 2405–2408.

- [27] C. Liu, G. Bellec, B. Vogginger, D. Kappel, J. Partzsch, F. Neumärker, S. Höppner, W. Maass, S. B. Furber, R. Legenstein, et al., Memory-efficient deep learning on a spinnaker 2 prototype, Frontiers in neuroscience 12.
- [28] A. P. Davison, D. Brüderle, J. M. Eppler, J. Kremkow, E. Muller, D. Pecevski, L. Perrinet, P. Yger, Pynn: a common interface for neuronal network simulators, Frontiers in neuroinformatics 2 (2009) 11.
- [29] B. Rueckauer, I.-A. Lungu, Y. Hu, M. Pfeiffer, S.-C. Liu, Conversion of continuous-valued deep networks to efficient event-driven networks for image classification, Frontiers in neuroscience 11 (2017) 682.
- [30] Y. Lecun, L. Bottou, Y. Bengio, P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE 86 (11) (1998) 2278–2324.
- [31] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, A. Lerer, Automatic differentiation in PyTorch, in: NIPS Autodiff Workshop, 2017.
- [32] G. Orchard, A. Jayawant, G. K. Cohen, N. Thakor, Converting static image datasets to spiking neuromorphic datasets using saccades, Frontiers in neuroscience 9 (2015) 437.
- [33] P. Merolla, J. Arthur, F. Akopyan, N. Imam, R. Manohar, D. S. Modha, A digital neurosynaptic core using embedded crossbar memory with 45pj per spike in 45nm, in: 2011 IEEE custom integrated circuits conference (CICC), IEEE, 2011, pp. 1–4.
- [34] B. Cuevas-Arteaga, J. P. Dominguez-Morales, H. Rostro-Gonzalez, A. Espinal, A. F. Jimenez-Fernandez, F. Gomez-Rodriguez, A. Linares-Barranco, A spinnaker application: design, implementation and validation of scpgs, in: International Work-Conference on Artificial Neural Networks, Springer, 2017, pp. 548–559.
- [35] O. Rhodes, P. A. Bogdan, C. Brenninkmeijer, S. Davidson, D. Fellows, A. Gait, D. R. Lester, M. Mikaitis, L. A. Plana, A. G. Rowley, et al., spynnaker: A software package for running pynn simulations on spinnaker, Frontiers in neuroscience 12.

- [36] D. Neil, S.-C. Liu, Minitaur, an event-driven fpga-based spiking network accelerator, IEEE Transactions on Very Large Scale Integration (VLSI) Systems 22 (12) (2014) 2621–2628.
- [37] D. Garbin, O. Bichler, E. Vianello, Q. Rafhay, C. Gamrat, L. Perniola, G. Ghibaudo, B. DeSalvo, Variability-tolerant convolutional neural network for pattern recognition applications based on oxram synapses, in: 2014 IEEE International Electron Devices Meeting, IEEE, 2014, pp. 28–4.
- [38] K. Mizuseki, G. Buzsáki, Preconfigured, skewed distribution of firing rates in the hippocampus and entorhinal cortex, Cell reports 4 (5) (2013) 1010–1021.
- [39] A. Roxin, N. Brunel, D. Hansel, G. Mongillo, C. van Vreeswijk, On the distribution of firing rates in networks of cortical neurons, Journal of Neuroscience 31 (45) (2011) 16217–16226.