Abstract:
An energy-efficient deep-learning processor called DNPU is proposed for the embedded processing of convolutional neural networks (CNNs) and recurrent neural networks (RNNs) in mobile platforms. DNPU uses a heterogeneous multi-core architecture to maximize energy efficiency in both CNNs and RNNs. In each core, a memory architecture, data paths, and processing elements are optimized depending on the characteristics of each network. Also, a mixed workload division method is proposed to minimize off-chip memory access in CNNs, and a quantization table-based matrix multiplier is proposed to remove duplicated multiplications in RNNs.
Published in: IEEE Micro (Volume: 38, Issue: 5, Sep./Oct. 2018)
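
The abstract does not describe the quantization table-based matrix multiplier in detail. One plausible reading is sketched below in plain Python, assuming the weights are quantized to a small set of levels and stored as indices into that set. Each input value is multiplied by every level once, and the matrix-vector product is then formed by table lookups and additions. The function names, the 4-level codebook, and the toy data are illustrative assumptions, not taken from the paper.

```python
def build_product_table(x, codebook):
    """Multiply one input value by every quantization level once."""
    return [x * level for level in codebook]


def qtable_matvec(weight_idx, codebook, inputs):
    """Matrix-vector product using a per-input table of precomputed products.

    weight_idx[r][c] is the index into `codebook` of the quantized weight
    at row r, column c (hypothetical storage layout for this sketch).
    """
    n_rows = len(weight_idx)
    out = [0.0] * n_rows
    for c, x in enumerate(inputs):
        # len(codebook) multiplications per input, regardless of row count.
        table = build_product_table(x, codebook)
        for r in range(n_rows):
            # Lookup and accumulate; no multiplication in the inner loop.
            out[r] += table[weight_idx[r][c]]
    return out


def reference_matvec(weight_idx, codebook, inputs):
    """Direct product with one multiplication per weight, for comparison."""
    return [sum(codebook[i] * x for i, x in zip(row, inputs))
            for row in weight_idx]


# Toy data (assumed): 4 quantization levels, a 4x3 weight matrix of indices.
codebook = [-0.5, 0.0, 0.25, 1.0]
weight_idx = [[0, 3, 2], [1, 1, 3], [2, 0, 0], [3, 2, 1]]
inputs = [2.0, -4.0, 8.0]

result = qtable_matvec(weight_idx, codebook, inputs)
print(result)  # [-3.0, 8.0, -1.5, 1.0]
```

In this sketch the number of multiplications per input equals the number of quantization levels rather than the number of matrix rows, so the saving only appears when the matrix has many more rows than the codebook has levels, as is typical for the fully connected weight matrices in RNNs.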