FreFlex: A High-Performance Processor for Convolution and Attention Computations via Sparsity-Adaptive Dynamic Frequency Boosting


Abstract:

A high degree of sparsity in machine learning (ML) models has been highlighted as a significant opportunity to improve energy and delay efficiencies by skipping the computation of zero elements in operands. Despite this potential, the unstructured positions of zeros and the wide range of sparsity levels make sparsity challenging to exploit in hardware implementations, which are often built on regular structures. To address these challenges, this article presents FreFlex, a low-power, high-performance AI accelerator that combines sparsity-adaptive dynamic frequency modulation (SA-DFM) with the proposed processing element (PE) in a 2-D systolic array. The sparsity of each layer is determined by counting zero elements in the output while the layer is being computed. The clock frequency is then modulated based on the sparsity level obtained from the previous layer’s output, which becomes the input of the next layer. The power slack left unused because of the sparsity is exploited to boost performance while fully using the power budget. The proposed technique achieves up to 1.8× performance improvement by exploiting the sparsity while incurring less than 7% power overhead, even when there is no sparsity. The silicon prototype, fabricated in a 65-nm CMOS node, demonstrates 0.6–1.0-TOPS/W efficiency for convolution and attention computations, with a performance of 160 GOPS/s/mm² and a maximum frequency of 1.1 GHz.
Published in: IEEE Journal of Solid-State Circuits (Volume: 59, Issue: 3, March 2024)
Page(s): 855 - 866
Date of Publication: 22 December 2023
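
The control loop described in the abstract (count zeros in a layer's output, then set the clock for the next layer from that sparsity) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The power model is an assumption made purely for illustration: dynamic power is taken to scale with the fraction of non-zero operands times the clock frequency, so a fixed power budget allows a frequency of roughly f_base / (1 - sparsity). The function names, the base frequency argument, and the rule that the first layer runs at the base frequency are all hypothetical; only the 1.1 GHz cap comes from the abstract.

```python
from typing import List, Sequence

F_MAX_GHZ = 1.1  # maximum frequency reported in the abstract


def measure_sparsity(outputs: Sequence[float]) -> float:
    """Return the fraction of zero elements in a layer's output."""
    if not outputs:
        return 0.0
    zeros = sum(1 for x in outputs if x == 0)
    return zeros / len(outputs)


def pick_frequency(sparsity: float, f_base_ghz: float,
                   f_max_ghz: float = F_MAX_GHZ) -> float:
    """Choose a clock frequency for a given input sparsity.

    Assumed toy model (not from the paper): power ~ (1 - sparsity) * f,
    so keeping power constant permits f = f_base / (1 - sparsity),
    capped at f_max.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    active = 1.0 - sparsity
    if active == 0.0:
        return f_max_ghz
    return min(f_max_ghz, f_base_ghz / active)


def schedule(layer_outputs: Sequence[Sequence[float]],
             f_base_ghz: float) -> List[float]:
    """Return one frequency per layer.

    layer_outputs[i] is the output of layer i. Layer i + 1 is clocked
    according to the sparsity of layer i's output; the first layer is
    assumed to run at the base frequency.
    """
    freqs = [f_base_ghz]
    for out in layer_outputs[:-1]:
        freqs.append(pick_frequency(measure_sparsity(out), f_base_ghz))
    return freqs
```

For example, with a hypothetical base frequency of 0.5 GHz, a layer whose output is half zeros lets the next layer run at 1.0 GHz under this toy model, while a dense output leaves the next layer at the base frequency.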
