Abstract:
Growing computational demands from deep neural networks (DNNs), coupled with diminishing returns from general-purpose architectures, have led to a proliferation of Neural Processing Units (NPUs). This paper describes the Project Brainwave NPU (BW-NPU), a parameterized microarchitecture specialized at synthesis time for convolutional and recurrent DNN workloads. The BW-NPU deployed on an Intel Stratix 10 280 FPGA achieves sustained performance of 35 teraflops at a batch size of 1 on a large recurrent neural network (RNN).
Published in: IEEE Micro (Volume: 39, Issue: 3, 01 May-June 2019)