Abstract:
FPGAs are commercially available off-the-shelf platforms for implementing convolutional neural network (CNN) accelerators that trade off accuracy, performance, and power. Systolic array architectures for CNN accelerators on FPGAs have the potential to run at a high frequency because of their regular and simple interconnections. However, current FPGA CAD tools are unable to synthesize and lay out systolic arrays with high quality. In this paper, we identify the reasons for the frequency degradation of systolic array designs for CNN accelerators. We also propose two methods to improve the frequency, one at the front end and one at the back end. The experimental results show that our methods achieve 1.29× higher frequency and attain 1.5 TOPS for the VGG16 network on the Xilinx KCU1500 platform.
Date of Conference: 26-29 May 2019
Date Added to IEEE Xplore: 01 May 2019
Print ISBN: 978-1-7281-0397-6
Print ISSN: 2158-1525