Abstract:
Recent breakthroughs in Neural Networks (NNs) have made DNN accelerators ubiquitous and led to an ever-increasing quest on adopting them from Cloud to edge computing. How...Show MoreMetadata
Abstract:
Recent breakthroughs in Neural Networks (NNs) have made DNN accelerators ubiquitous and led to an ever-increasing quest on adopting them from Cloud to edge computing. However, state-of-the-art DNN accelerators pack immense computational power in a relatively confined area, inducing significant on-chip power densities that lead to intolerable thermal bottlenecks. Existing state of the art focuses on using approximate multipliers only to trade-off efficiency with inference accuracy. In this work, we present a thermal-aware approximate DNN accelerator design in which we additionally trade-off approximation with temperature effects towards designing DNN accelerators that satisfy tight temperature constraints. Using commercial multi-physics tool flows for heat simulations, we demonstrate how our thermal-aware approximate design reduces the temperature from 139^{\circ } C, in an accurate circuit, down to 79^{\circ } C. This enables DNN accelerators to fulfill tight thermal constraints, while still maximizing the performance and reducing the energy by around 75% with a negligible accuracy loss of merely 0.44% on average for a wide range of NN models. Furthermore, using physics-based transistor aging models, we demonstrate how reductions in voltage and temperature obtained by our approximate design considerably improve the circuit’s reliability. Our approximate design exhibits around 40% less aging-induced degradation compared to the baseline design.
Published in: IEEE Transactions on Computers ( Volume: 71, Issue: 10, 01 October 2022)