A 22nm 54.94TFLOPS/W Transformer Fine-Tuning Processor with Exponent-Stationary Re-Computing, Aggressive Linear Fitting, and Logarithmic Domain Multiplicating

Abstract:

This paper proposes a Transformer processor that supports energy-efficient fine-tuning through multi-level optimizations at the batch, iteration, and matrix levels. It has three key features: 1) An exponent-stationary re-computing scheduler (ESRC) reduces the storage requirement of each batch by 44.2%. 2) An aggressive linear fitting unit (ALFU) saves 47.4% of the computation in each iteration. 3) A logarithmic domain processing element (LDPE) cuts the energy of matrix multiplications (MM) in fine-tuning by 36.3%. The proposed Transformer processor achieves an energy efficiency of 54.94TFLOPS/W, reduces fine-tuning energy by 4.27×, and offers a 3.57× speedup for GPT-2.
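
As a rough illustration of how logarithmic-domain multiplication trades a costly multiplier for an adder, the sketch below implements Mitchell-style log-domain multiplication in Python: each operand's log2 is approximated from its exponent and mantissa fraction, the product is computed as a sum in the log domain, and the result is converted back. This is an assumption-level sketch of the generic technique only; the abstract does not specify the LDPE's actual number format, datapath, or error compensation, and the function names here are hypothetical.

    # Minimal sketch of Mitchell-style log-domain multiplication.
    # Hypothetical illustration only: not the paper's LDPE design.
    import math

    def log2_approx(x: float) -> float:
        """Approximate log2(x) for x > 0 from exponent and mantissa:
        log2((1 + f) * 2**k) ~= k + f (Mitchell's approximation)."""
        m, e = math.frexp(x)        # x = m * 2**e with m in [0.5, 1)
        f = 2.0 * m - 1.0           # mantissa fraction, x = (1 + f) * 2**(e - 1)
        return (e - 1) + f

    def log_domain_mul(a: float, b: float) -> float:
        """Multiply two positive numbers by adding their approximate logs,
        then converting back with 2**(k + f) ~= (1 + f) * 2**k."""
        s = log2_approx(a) + log2_approx(b)   # the multiply becomes an add
        k = math.floor(s)
        f = s - k
        return (1.0 + f) * (2.0 ** k)

    if __name__ == "__main__":
        a, b = 3.7, 12.5
        approx = log_domain_mul(a, b)
        print(f"approx = {approx:.2f}, exact = {a * b:.2f}")   # ~45.20 vs 46.25

Mitchell's approximation always underestimates the true product, with a worst-case relative error of about 11%; this accuracy-for-energy trade-off is what motivates log-domain processing elements.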
Date of Conference: 16-20 June 2024
Date Added to IEEE Xplore: 26 August 2024
Publisher: IEEE
Conference Location: Honolulu, HI, USA
