
SoftAct: A High-Precision Softmax Architecture for Transformers Supporting Nonlinear Functions


Abstract:

Transformer-based deep learning networks are revolutionizing our society. Convolution and attention co-designed (CAC) Transformers have demonstrated superior performance compared with conventional Transformer-based networks. However, CAC Transformer networks contain various nonlinear functions, such as softmax and complex activation functions, which require high-precision hardware designs that typically incur significant area and power costs. To address these challenges, SoftAct, a compact and high-precision algorithm-hardware co-designed architecture, is proposed to implement both softmax and the nonlinear activation functions in CAC Transformer accelerators. An improved softmax algorithm with penalties is proposed to maintain precision in hardware. A stage-wise full zero detection method is developed to skip redundant computations in softmax. A compact and reconfigurable architecture with a symmetrically designed linear fitting module is proposed to realize the nonlinear functions. The SoftAct architecture is designed in an industrial 28-nm CMOS technology, with the MobileViT-xxs network classifying the ImageNet-1k dataset as the benchmark. Compared with the state of the art, SoftAct improves network accuracy by up to 5.87% under 8-bit quantization, area efficiency by up to 153.2×, and overall efficiency by up to 1435×.
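The abstract does not spell out SoftAct's penalty formulation or its stage-wise detection logic, but the two softmax ideas it names, keeping exponentials in a hardware-friendly range and skipping inputs whose exponentials quantize to zero, can be illustrated with a short sketch. The snippet below is a minimal NumPy illustration under those assumptions, not the authors' architecture; the zero_threshold parameter is hypothetical.

import numpy as np

def stable_softmax_with_skip(x, zero_threshold=-16.0):
    # Illustrative sketch only: standard max subtraction keeps exp() in a
    # representable range, and entries far below the maximum are treated as
    # exact zeros so their exponentials need not be computed. SoftAct's
    # penalty terms and stage-wise full zero detection are not reproduced.
    x = np.asarray(x, dtype=np.float32)
    shifted = x - x.max()               # max subtraction for numerical stability
    keep = shifted > zero_threshold     # detect inputs whose exp() would underflow
    exps = np.zeros_like(shifted)
    exps[keep] = np.exp(shifted[keep])  # skip the detected zero entries
    return exps / exps.sum()

print(stable_softmax_with_skip([2.0, 1.0, -40.0]))
# -> [0.7311, 0.2689, 0.0]; the third exponential is never evaluated.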
Published in: IEEE Transactions on Circuits and Systems for Video Technology ( Volume: 34, Issue: 9, September 2024)
Page(s): 8912 - 8923
Date of Publication: 09 April 2024
