Abstract:
Deep Neural Networks (DNNs) have become the state of the art in several domains such as computer vision and speech recognition. However, using DNNs in embedded applications is still strongly limited by their complexity and the energy required to process large data sets. In this paper, we present the architecture of an accelerator for quantized neural networks and its implementation on a Nallatech 385-A7 board with an Altera Stratix V GX A7 FPGA. The accelerator's design centers on the matrix-vector product as the key primitive, and exploits bit-slicing to extract maximum performance from low-precision arithmetic.
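The bit-slicing idea behind such low-precision matrix-vector hardware can be illustrated with a minimal software sketch (not the paper's actual RTL or parameters, which are assumptions here): each low-precision operand vector is decomposed into packed bit-planes, and a dot product is reassembled from AND and popcount operations on those planes, weighted by powers of two. The example below assumes 2-bit unsigned activations and weights, 64-element vectors, and a GCC/Clang popcount builtin.

#include <stdint.h>
#include <stdio.h>

#define BITS 2      /* assumed operand precision (activations and weights) */
#define N    64     /* vector length; one 64-bit word per bit-plane */

/* Reference dot product on unpacked values. */
static uint32_t dot_ref(const uint8_t *x, const uint8_t *w) {
    uint32_t acc = 0;
    for (int i = 0; i < N; i++) acc += (uint32_t)x[i] * w[i];
    return acc;
}

/* Pack bit-plane k of a vector of small unsigned values into one 64-bit word. */
static uint64_t pack_plane(const uint8_t *v, int k) {
    uint64_t plane = 0;
    for (int i = 0; i < N; i++) plane |= (uint64_t)((v[i] >> k) & 1u) << i;
    return plane;
}

/* Bit-sliced dot product: sum over plane pairs of 2^(j+k) * popcount(x_j & w_k). */
static uint32_t dot_bitsliced(const uint64_t xp[BITS], const uint64_t wp[BITS]) {
    uint32_t acc = 0;
    for (int j = 0; j < BITS; j++)
        for (int k = 0; k < BITS; k++)
            acc += (uint32_t)__builtin_popcountll(xp[j] & wp[k]) << (j + k);
    return acc;
}

int main(void) {
    uint8_t x[N], w[N];
    for (int i = 0; i < N; i++) {            /* arbitrary 2-bit test data */
        x[i] = (uint8_t)(i % 4);
        w[i] = (uint8_t)((3 * i + 1) % 4);
    }
    uint64_t xp[BITS], wp[BITS];
    for (int k = 0; k < BITS; k++) {
        xp[k] = pack_plane(x, k);
        wp[k] = pack_plane(w, k);
    }
    printf("ref=%u bitsliced=%u\n", dot_ref(x, w), dot_bitsliced(xp, wp));
    return 0;
}

In hardware, each AND-plus-popcount over a bit-plane pair maps to cheap bitwise logic and an adder tree, which is why bit-slicing lets a fixed datapath serve several precisions; a full matrix-vector product simply repeats this dot product per output row.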
Date of Conference: 26-29 May 2019
Date Added to IEEE Xplore: 01 May 2019
Print ISBN: 978-1-7281-0397-6
Print ISSN: 2158-1525