Abstract:
We first propose a method that exploits the spatially linear correlation between activations and weights of CNN models to accelerate inference via linear regression. The proposed regression-based acceleration method exposes stronger bit-sparsity, which reduces ineffectual computations during inference in CNN acceleration systems without introducing any error into the algorithm. We also propose a corresponding hardware accelerator, LINAC, which implements this linear convolution mechanism and exploits the resulting bit-sparsity to significantly improve performance and energy efficiency. In our experiments, LINAC improves inference performance by 11× on average over an existing value-agnostic accelerator and outperforms other state-of-the-art bit-sparse accelerators.
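The abstract does not give the details of the linear convolution, so the following is only a minimal sketch of the general idea, under stated assumptions: a run of spatially adjacent integer values is approximated by a fitted line, the small residuals carry fewer nonzero bits than the original values, and the dot product is reconstructed exactly because it is linear. The helper names (`fit_line`, `split_residuals`, `dot_via_residuals`) and the sample data are hypothetical and are not taken from the paper or from LINAC.

```python
def popcount(v):
    """Number of nonzero bits in the magnitude of an integer."""
    return bin(abs(v)).count("1")


def fit_line(values):
    """Least-squares line over the indices 0..n-1, rounded to integer
    slope and intercept (assumption: integer parameters keep the math exact).
    Requires at least two values."""
    n = len(values)
    mean_k = (n - 1) / 2
    mean_v = sum(values) / n
    sxx = sum((k - mean_k) ** 2 for k in range(n))
    sxy = sum((k - mean_k) * (v - mean_v) for k, v in enumerate(values))
    slope = sxy / sxx
    return round(slope), round(mean_v - slope * mean_k)


def split_residuals(values):
    """Write values[k] = a*k + b + r[k] with integer a, b and residuals r."""
    a, b = fit_line(values)
    residuals = [v - (a * k + b) for k, v in enumerate(values)]
    return a, b, residuals


def dot_direct(acts, weights):
    """Reference dot product on the original values."""
    return sum(x * w for x, w in zip(acts, weights))


def dot_via_residuals(acts, weights):
    """Same dot product, computed from the fitted line plus the residuals.

    sum(x*w) = a*sum(k*x) + b*sum(x) + sum(x*r), which holds exactly in
    integer arithmetic, so no approximation error is introduced.
    """
    a, b, r = split_residuals(weights)
    weighted_index_sum = sum(k * x for k, x in enumerate(acts))
    plain_sum = sum(acts)
    residual_dot = sum(x * rk for x, rk in zip(acts, r))
    return a * weighted_index_sum + b * plain_sum + residual_dot


if __name__ == "__main__":
    weights = [40, 43, 47, 50, 52]   # hypothetical, roughly linear in position
    acts = [3, 1, 4, 1, 5]           # hypothetical activations
    a, b, r = split_residuals(weights)
    print("slope, intercept, residuals:", a, b, r)
    print("nonzero bits, original :", sum(popcount(w) for w in weights))
    print("nonzero bits, residuals:", sum(popcount(x) for x in r))
    print("exact match:", dot_via_residuals(acts, weights) == dot_direct(acts, weights))
```

In this sketch the per-element multiplications involve only the residuals, which is where a bit-sparse datapath would skip zero bits; the two extra sums depend only on the activations and could be shared across every weight run applied to the same window. Whether LINAC organizes the computation this way is not stated in the abstract.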
Published in: IEEE Computer Architecture Letters (Volume: 21, Issue: 1, 01 Jan.-June 2022)