
Area Efficient Compression for Floating-Point Feature Maps in Convolutional Neural Network Accelerators



Abstract:

Since convolutional neural networks (CNNs) require massive computing resources, many computing architectures have been proposed to improve the throughput and energy efficiency of CNN computation. However, these architectures require heavy data movement between the chip and off-chip memory, which incurs high energy consumption in the off-chip memory; feature map (fmap) compression has therefore been studied as a way to reduce this data movement, and its design has become a major research focus for energy-efficient CNN accelerators. In this brief, we propose a floating-point (FP) fmap compression scheme for a hardware accelerator, comprising both a hardware design and a compression algorithm. The scheme is compatible with quantization methods such as trained ternary quantization (TTQ), which quantizes only the weights with little or no accuracy degradation and reduces computation cost. In addition to zero compression, we also compress the nonzero values in the fmap based on the FP format. The compression algorithm achieves low area overhead and a compression ratio comparable to the state of the art on the ILSVRC 2012 dataset.
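To make the idea concrete, the sketch below illustrates one plausible form of such a scheme in Python: a bitmap marks the zero entries, and each nonzero FP32 value keeps its sign, exponent, and only the top mantissa bits. The function names, the bitmap layout, and the mantissa_bits parameter are illustrative assumptions for this sketch, not the brief's actual encoding.

```python
import struct
import numpy as np

def compress_fmap(fmap, mantissa_bits=10):
    """Sketch: zero-bitmap compression plus FP-format-aware truncation
    of nonzero values. Illustrative only; the brief's encoding differs."""
    flat = fmap.astype(np.float32).ravel()
    bitmap = flat != 0.0                      # one bit per element marks nonzeros
    drop = 23 - mantissa_bits                 # mantissa bits discarded per value
    payload = []
    for v in flat[bitmap]:
        word = struct.unpack("<I", struct.pack("<f", v))[0]
        payload.append(word >> drop)          # keep sign, exponent, top mantissa bits
    return np.packbits(bitmap), payload, drop

def decompress_fmap(bitmap_bytes, payload, drop, shape):
    """Inverse of compress_fmap: restore zeros, re-expand truncated words."""
    n = int(np.prod(shape))
    bitmap = np.unpackbits(bitmap_bytes)[:n].astype(bool)
    out = np.zeros(n, dtype=np.float32)
    out[bitmap] = [struct.unpack("<f", struct.pack("<I", w << drop))[0]
                   for w in payload]
    return out.reshape(shape)
```

Keeping the exponent field intact preserves the FP dynamic range, which is why truncating mantissa bits in this format-aware way tends to lose far less accuracy than dropping the same number of bits from a fixed-point value.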
Published in: IEEE Transactions on Circuits and Systems II: Express Briefs (Volume: 70, Issue: 2, February 2023)
Page(s): 746 - 750
Date of Publication: 12 October 2022


