Abstract:
As neural network (NN) training cost has been growing exponentially over the past decade, developing high-speed and energy-efficient training methods has become an urgent task. Fine-grained mixed-precision low-bit training is the most promising way to achieve high-efficiency training, but it needs dedicated processor designs to overcome the overhead in control, storage, and I/O and to remove the power bottleneck in floating-point (FP) units. This article presents a dynamic-execution NN processor supporting fine-grained mixed-precision training through an online quantization sensitivity analysis. Three key features are proposed: a quantization-sensitivity-aware dynamic execution controller, a dynamic bit-width adaptive datapath design, and a low-power multi-level-aligned block-FP unit (BFPU). This chip achieves 13.2-TFLOPS/W energy efficiency and 1.07-TFLOPS/mm² area efficiency.
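To make the block-FP idea concrete, below is a minimal, hypothetical sketch of block floating-point quantization: all values in a block share one exponent derived from the block maximum, and each value keeps only a few signed mantissa bits. This is an illustration of the general BFP concept under assumed conventions (function name, rounding, and mantissa range are the author's of this sketch), not the paper's actual BFPU design.

```python
import numpy as np

def block_fp_quantize(x, mant_bits=4):
    """Quantize a block of values to block floating-point (BFP).

    All values share one exponent, taken from the block's largest
    magnitude; each value keeps `mant_bits` bits of signed integer
    mantissa. Returns the dequantized block and the shared exponent.
    """
    max_abs = np.max(np.abs(x))
    # Shared exponent: power of two of the largest value in the block.
    shared_exp = int(np.floor(np.log2(max_abs + 1e-30)))
    # Step size so the largest value fits in the signed mantissa range.
    scale = 2.0 ** (shared_exp - (mant_bits - 2))
    lo, hi = -(2 ** (mant_bits - 1)), 2 ** (mant_bits - 1) - 1
    mantissas = np.clip(np.round(x / scale), lo, hi)
    return mantissas * scale, shared_exp

# Example: a 4-element block quantized to 4-bit mantissas.
x = np.array([0.5, -1.5, 3.0, 0.1])
q, e = block_fp_quantize(x, mant_bits=4)
```

Because multiply-accumulate within a block then reduces to cheap integer arithmetic plus one shared-exponent adjustment, BFP datapaths avoid per-element FP alignment, which is the power bottleneck the abstract refers to.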
Published in: IEEE Journal of Solid-State Circuits (Volume: 59, Issue: 9, September 2024)