Abstract:
Vision Transformers, which evolved from attention-based transformer architectures, have attracted considerable attention as a competitive alternative to convolutional neural networks in computer vision tasks. Although they show promising accuracy, vision transformers face challenges due to their large model size and intensive computation workload: with millions of parameters, their practical deployment on resource-constrained platforms is inhibited. In this paper, we investigate trivial computations in vision transformers, that is, computations whose results can be determined without actually performing them. Typical examples include multiplication by 0, +1, or -1 and addition with 0. Existing studies have leveraged the existence of trivial computations to save energy or accelerate execution. Building on this, we present AxBy-ViT, a reconfigurable approximate computation bypass scheme for vision transformers that leverages approximate matching of the bit representations of operands. A case study on vision transformers using the ImageNet dataset shows that, with 2.4% accuracy loss, AxBy-ViT can match up to 28.69% of operands for computation bypass. The code repository of AxBy-ViT can be found at https://github.com/VU-DETAIL/AxBy-ViT.
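To make the idea of computation bypass via approximate bit matching concrete, the following is a minimal sketch, not the paper's actual implementation. It assumes 8-bit two's-complement operands and a hypothetical reconfigurable parameter `MASK_BITS` that sets how many low-order bits are ignored when deciding whether an operand is approximately zero; exact +1 and -1 operands are also bypassed.

```python
# Hypothetical sketch of trivial-computation bypass with approximate
# bit matching, in the spirit of AxBy-ViT (not the authors' code).
# Operands are 8-bit two's-complement integers.

MASK_BITS = 2                           # reconfigurable approximation level
MASK = (~((1 << MASK_BITS) - 1)) & 0xFF  # keeps only the high-order bits

def is_approx_zero(x):
    """True if the 8-bit value x has small magnitude: all high-order
    bits are 0 (small positive) or all 1 (small negative)."""
    hi = x & MASK
    return hi == 0 or hi == MASK

def approx_mul(a, b):
    """Multiply a * b, bypassing the multiplier for trivial operands."""
    au, bu = a & 0xFF, b & 0xFF
    # exact trivial operands: +1 and -1 (0xFF in two's complement)
    if au == 0x01:
        return b
    if au == 0xFF:
        return -b
    if bu == 0x01:
        return a
    if bu == 0xFF:
        return -a
    # approximately-zero operand: bypass and return 0
    if is_approx_zero(au) or is_approx_zero(bu):
        return 0
    return a * b  # fall back to the exact multiplication
```

With `MASK_BITS = 2`, any operand in the range -4..3 is matched as approximately zero and the multiplication is skipped entirely, which is the source of both the energy savings and the accuracy loss; raising `MASK_BITS` widens the match window and trades more accuracy for more bypassed operations.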
Date of Conference: 06-07 April 2022
Date Added to IEEE Xplore: 29 June 2022