Abstract:
This paper proposes a Transformer processor supporting energy-efficient fine-tuning through batch-, iteration-, and matrix-level optimizations. It has three key features: 1) an exponent-stationary re-computing scheduler (ESRC) reduces the per-batch storage requirement by 44.2%; 2) an aggressive linear fitting unit (ALFU) saves 47.4% of the computations in each iteration; 3) a logarithmic domain processing element (LDPE) cuts the energy of matrix multiplications (MM) during fine-tuning by 36.3%. The proposed processor achieves an energy efficiency of 54.94 TFLOPS/W, reduces fine-tuning energy by 4.27×, and offers a 3.57× speedup for GPT-2.
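The abstract does not detail how the LDPE operates, but the general idea behind logarithmic-domain arithmetic is that a multiplication becomes an addition of (approximate) log2 values, which is far cheaper in hardware. The Python sketch below illustrates this with Mitchell's approximation; it is an assumption-level illustration of the concept, not the paper's LDPE design, and all function names are illustrative.

import math

def approx_log2(x: float) -> float:
    # Mitchell's approximation for x > 0: log2(x) ~ exponent + mantissa fraction,
    # where x = 2^e * (1 + m) with 0 <= m < 1.
    m, e = math.frexp(x)              # x = m * 2^e with 0.5 <= m < 1
    return (e - 1) + (2.0 * m - 1.0)  # exponent plus mantissa fraction

def approx_pow2(y: float) -> float:
    # Inverse of the approximation: 2^y ~ 2^floor(y) * (1 + frac(y)).
    e = math.floor(y)
    m = y - e
    return (2.0 ** e) * (1.0 + m)

def log_domain_mul(a: float, b: float) -> float:
    # Approximate a*b (a, b > 0) using only an addition in the log domain.
    return approx_pow2(approx_log2(a) + approx_log2(b))

if __name__ == "__main__":
    a, b = 3.7, 12.5
    print(f"exact  : {a * b:.3f}")
    print(f"log-dom: {log_domain_mul(a, b):.3f}")  # small relative error (a few percent)

Replacing each multiplier in a matrix-multiplication array with an adder in this way is one common route to the kind of MM energy savings the abstract reports, though the specific LDPE mechanism would need to be confirmed in the full paper.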
Date of Conference: 16-20 June 2024
Date Added to IEEE Xplore: 26 August 2024
Conference Location: Honolulu, HI, USA