Abstract:
This paper presents a configurable, versatile, and flexible architecture for hardware acceleration of convolutional neural networks (CNNs) that is based on storing and accumulating entire feature maps in local memory inside the accelerator. The design aims to process any type of CNN while consuming as little power as possible and achieving the highest possible energy efficiency. Energy efficiency here refers to the number of operations per unit energy, measured in multiply-accumulate operations per second per watt (MACs/s/W, equivalently MACs/J). Two versions of the architecture have been synthesized and tested using different configurations of hardware parameters. The architecture compares well to the state of the art, achieving an energy efficiency improvement of more than a factor of 5 for select CNN layers. The most energy-efficient configuration achieves 175 GMACs/s/W while consuming 2.3 mW of power and occupying 585 kGE (kilo gate equivalents) of area at a 1 V supply voltage and a 100 MHz clock frequency.
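As a quick worked example (not part of the paper itself), the reported energy efficiency and power figures imply the accelerator's throughput, since MACs/s/W multiplied by watts yields MACs/s. The short Python sketch below performs that arithmetic using only the numbers quoted in the abstract.

    # Sanity-check sketch using the figures reported in the abstract.
    # Energy efficiency (MACs/s/W) times power (W) gives throughput (MACs/s).
    energy_efficiency_macs_per_joule = 175e9   # 175 GMACs/s/W, as reported
    power_watts = 2.3e-3                       # 2.3 mW, as reported

    implied_throughput_macs_per_s = energy_efficiency_macs_per_joule * power_watts
    print(f"Implied throughput: {implied_throughput_macs_per_s / 1e6:.1f} MMAC/s")
    # -> roughly 402.5 MMAC/s at the most energy-efficient operating point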
Published in: 2019 IEEE Nordic Circuits and Systems Conference (NORCAS): NORCHIP and International Symposium of System-on-Chip (SoC)
Date of Conference: 29-30 October 2019
Date Added to IEEE Xplore: 21 November 2019