Abstract:
Spiking Neural Networks (SNNs), known as the third generation of neural networks, are noted for their biological plausibility and brain-like characteristics. Recent efforts further demonstrate the potential of SNNs for high-speed inference by designing accelerators that exploit parallelism in the temporal or spatial dimension. However, given limited hardware resources, accelerator designs must rely on off-chip memory to store large amounts of intermediate data, which leads to both high power consumption and long latency. In this paper, we focus on the data flow between layers to improve arithmetic efficiency. Exploiting the discrete nature of spikes, we design a convolution-pooling (CONVP) unit that fuses the processing of the convolutional layer and the pooling layer to reduce latency and resource utilization. Furthermore, for the fully-connected layer, we apply intra-output parallelism and inter-output parallelism to accelerate network inference. We demonstrate the effectiveness of the proposed hardware architecture by implementing different SNN models on different datasets on a Zynq XA7Z020 FPGA. Experiments show that our accelerator achieves about a 28x inference speedup with competitive power compared with an FPGA implementation on the MNIST dataset, and a 15x inference speedup with low power compared with an ASIC design on the DVSGesture dataset.
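To illustrate the fusion idea behind the CONVP unit: because spikes are binary, max pooling over a window of spike outputs reduces to a logical OR, so the pooling decision can be folded into the convolution output stage without buffering a full intermediate feature map. The sketch below is a minimal software model of that principle, assuming a single timestep and a simple integrate-and-fire threshold; the function name fused_conv_pool and its parameters are illustrative assumptions, not the authors' RTL design.

```python
import numpy as np

def fused_conv_pool(spikes, weights, threshold, pool=2):
    """Sketch: convolve a binary spike map, fire, and pool in one pass.

    spikes:    (H, W) binary input spike map for one timestep
    weights:   (k, k) synaptic weights for one output channel
    threshold: firing threshold of the output neuron
    pool:      pooling window size (assumed non-overlapping)
    """
    k = weights.shape[0]
    H, W = spikes.shape
    oh, ow = H - k + 1, W - k + 1  # valid convolution output size
    out = np.zeros((oh // pool, ow // pool), dtype=np.uint8)
    for i in range(0, oh - oh % pool, pool):
        for j in range(0, ow - ow % pool, pool):
            fired = 0
            # Compute the pool x pool conv outputs of this window and OR
            # their spike decisions: max over binary values is a logical
            # OR, so no intermediate feature map is ever stored.
            for di in range(pool):
                for dj in range(pool):
                    v = np.sum(spikes[i+di:i+di+k, j+dj:j+dj+k] * weights)
                    fired |= int(v >= threshold)
            out[i // pool, j // pool] = fired
    return out
```

In hardware, the same observation means the pooling unit needs only a one-bit OR tree rather than comparators and a line buffer for the pre-pooling feature map, which is the resource and latency saving the abstract refers to. Membrane-potential accumulation across timesteps is omitted here for brevity.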
Date of Conference: 18-23 June 2023
Date Added to IEEE Xplore: 02 August 2023