Abstract:
This article presents AWARE-CNN (application-aware real-time edge acceleration of CNNs), a novel architecture design methodology for real-time execution of deep learning algorithms on IoT devices. AWARE-CNN leverages the reconfigurability of field-programmable gate arrays (FPGAs) to create application-specific architectures customized to match the inherent dataflow of the targeted deep neural networks and the user-specified real-time requirements. The customized datapath is combined with a customized memory path to guarantee deterministic, latency-aware execution over streaming data. For evaluation, we have developed a Chisel-based implementation of AWARE-CNN together with a fully integrated framework for application-specific architecture generation and synthesis (the AWARE-CNN architecture compiler). Our results demonstrate Tiny DarkNet and shallow MobileNet inference at 120 frames/s (FPS) and 75 FPS, using only 2.8 W and 3.4 W, respectively, on a Xilinx XCZU9EG FPGA. In addition, the AWARE-CNN framework's flexibility with respect to the targeted convolutional neural networks (CNNs) and user constraints is validated by targeting additional design points for AlexNet (as a baseline network) and Tiny YOLOv2.
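To put the reported throughput and power figures in per-frame terms, the following is a minimal sketch of the arithmetic only. It assumes the quoted wattages are average power while running at the quoted frame rates, which the abstract does not state explicitly; the function name `per_frame_budget` is hypothetical and not part of the AWARE-CNN framework.

```python
def per_frame_budget(fps, power_w):
    """Return (frame period in ms, energy per frame in mJ).

    Assumes power_w is the average power drawn while sustaining fps.
    """
    period_ms = 1000.0 / fps            # time available per frame at this throughput
    energy_mj = 1000.0 * power_w / fps  # joules per frame, expressed in millijoules
    return period_ms, energy_mj


# Figures quoted in the abstract: (frames per second, watts).
reported = {
    "Tiny DarkNet": (120.0, 2.8),
    "shallow MobileNet": (75.0, 3.4),
}

for name, (fps, watts) in reported.items():
    period_ms, energy_mj = per_frame_budget(fps, watts)
    print(f"{name}: {period_ms:.2f} ms/frame, {energy_mj:.1f} mJ/frame")
```

Under that assumption, this works out to roughly 8.33 ms and 23.3 mJ per frame for Tiny DarkNet, and 13.33 ms and 45.3 mJ per frame for shallow MobileNet. The frame period is derived from throughput alone; in a pipelined design the end-to-end latency of a single frame may be longer.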
Published in: IEEE Internet of Things Journal (Volume: 7, Issue: 10, October 2020)