Abstract:
Following the remarkable performance demonstrated by the Transformer architecture in both computer vision and natural language processing (NLP), there is a growing demand for embedded systems capable of efficiently executing Vision Transformer (ViT) applications as well as Convolutional Neural Network (CNN) applications. Since CNN accelerators are already in wide commercial use, this paper explores the possibility of using existing CNN accelerators to support ViT rather than developing a separate accelerator for each workload. CNN accelerators have inherent limitations in efficiently handling two classes of Transformer operations: matrix multiplication (MM) with two non-constant matrices, and nonlinear operations. To overcome these limitations, we first propose a novel technique to efficiently handle MM operations without special reshaping hardware in an adder-tree-type CNN accelerator. We also propose an optimal scheduling method to minimize the idle time caused by offloading the computation of the Transformer's nonlinear operations. Additionally, we investigate the possibility of executing layer normalization and GELU operations on the accelerator with minor hardware extensions. The experimental results validate the effectiveness of the proposed methods.
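The sketch below is illustrative and not taken from the paper: it shows, in NumPy, why the self-attention score computation is awkward for a weight-stationary CNN accelerator (both MM operands are produced at run time, unlike a convolution whose weights are compile-time constants), and it includes the standard tanh-based GELU approximation (Hendrycks & Gimpel) as an example of the kind of nonlinearity the paper considers running on-accelerator. All variable names and sizes here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
tokens, dim = 4, 8

x = rng.standard_normal((tokens, dim))   # run-time activations
W_q = rng.standard_normal((dim, dim))    # constant weights, known at compile time
W_k = rng.standard_normal((dim, dim))

# Activation x constant weight: maps naturally onto a CNN accelerator,
# since one operand can be preloaded like convolution filters.
Q = x @ W_q
K = x @ W_k

# Two non-constant operands: the problematic MM class the paper targets,
# because neither Q nor K.T exists before inference begins.
scores = Q @ K.T

def gelu_tanh(v):
    """Tanh approximation of GELU, a candidate for on-accelerator execution."""
    return 0.5 * v * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (v + 0.044715 * v**3)))

print(scores.shape, gelu_tanh(scores).mean())
```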
Date of Conference: 18-20 November 2024
Date Added to IEEE Xplore: 02 January 2025