Abstract:
Untethered computing using deep convolutional neural networks (DCNNs) at the edge of IoT with limited resources requires systems that are exceedingly power- and area-efficient. Analog in-memory matrix-matrix multiplications (M2M) enabled by emerging memories can significantly reduce the energy budget of such systems and result in compact accelerators. In this article, we report a high-throughput RRAM-based DCNN processor that boasts 7.12× area-efficiency (AE) and 6.52× power-efficiency (PE) enhancements over state-of-the-art accelerators. We achieve this by coupling a novel in-memory computing methodology with a staggered-3D memristor array. Our variation-tolerant in-memory compute method, which performs operations on signed floating-point numbers within a single array, leverages charge-domain operations and conductance discretization to reduce peripheral overheads. Voltage pulses applied at the staggered bottom electrodes of the 3D-array generate a concurrent input shift and parallelize convolution operations to boost throughput. The high density and low footprint of the 3D-array, along with the modified in-memory M2M execution, improve peak AE to 9.1 TOPs mm⁻², while the elimination of input regeneration improves PE to 10.6 TOPs W⁻¹. This work provides a path towards infallible RRAM-based hardware accelerators that are fast, low power, and low area.
Published in: IEEE Internet of Things Journal (Volume: 8, Issue: 11, 01 June 2021)