Abstract:
Vision Transformer (ViT) has recently demonstrated impressive nonlinear modeling capabilities and achieved state-of-the-art performance in various industrial applications, such as object recognition, anomaly detection, and robot control. However, its practical deployment can be hindered by high storage requirements and computational intensity. To alleviate these challenges, we propose a binary transformer called BinaryFormer, which quantizes the learned weights of the ViT module from 32-b precision to 1 b. Furthermore, we propose a hierarchical-adaptive architecture that replaces expensive matrix operations with more affordable addition and bit operations by switching between two attention modes. As a result, BinaryFormer effectively compresses the model size and reduces the computation cost of ViT. Experimental results on the ImageNet-1K benchmark dataset show that BinaryFormer reduces the size of a typical ViT model by an average of 27.7× and converts over 99% of multiplication operations into bit operations while maintaining reasonable accuracy.
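The abstract describes quantizing 32-b weights to 1 b and replacing multiplications with additions and bit operations. Below is a minimal sketch of what such weight binarization can look like, assuming a common XNOR-Net-style sign-plus-scale scheme; the helper names `binarize_weights` and `binary_linear` are hypothetical and the paper's exact method is not specified in the abstract.

```python
import numpy as np

def binarize_weights(W):
    """Binarize a float32 weight matrix to {-1, +1} with a per-row scaling
    factor (assumption: XNOR-Net-style recipe, not necessarily the paper's)."""
    # alpha: mean absolute value per output row, restores dynamic range
    alpha = np.mean(np.abs(W), axis=1, keepdims=True)
    # B: 1-bit weights stored as signs
    B = np.sign(W)
    B[B == 0] = 1  # map exact zeros to +1 so every weight fits in 1 bit
    return B, alpha

def binary_linear(x, B, alpha):
    """Approximate x @ W.T using sign weights and a scaling factor.
    With binary B, the matrix product reduces to additions/subtractions
    (or XNOR-popcount on packed bits in an optimized kernel)."""
    return (x @ B.T) * alpha.T

# Usage: compare the binary approximation against the full-precision product.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8)).astype(np.float32)   # full-precision weights
x = rng.standard_normal((2, 8)).astype(np.float32)   # input activations
B, alpha = binarize_weights(W)
print(np.abs(x @ W.T - binary_linear(x, B, alpha)).mean())  # approximation error
```

Storing only the signs plus one scale per row is what yields the large compression factor reported in the abstract, since each 32-b weight collapses to a single bit.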
Published in: IEEE Transactions on Industrial Informatics (Volume: 20, Issue: 8, August 2024)