Hybrid-Grained Pruning and Hardware Acceleration for Convolutional Neural Networks


Abstract:

Across various convolutional neural network (CNN) models, sparsity increases as the network deepens, which offers significant potential for model compression and hardware acceleration. In this paper, a dual-factor hybrid-grained pruning method is introduced to strike a good balance between model compression and accuracy preservation. The proposed pruning method combines hardware-friendly unstructured vector-level pruning with structured filter-level pruning to exploit multiple grains of sparsity in CNNs. The architecture of the corresponding hardware accelerator is then proposed based on a row-based convolution dataflow, which fully utilizes the hybrid sparsity to accelerate CNN processing. Experimental results demonstrate that, on VGG16, the proposed method improves the compression rate by 1.08× with only a 0.21% accuracy loss compared with the state-of-the-art filter pruning method, and incurs only a 2.39% hardware resource increase compared with an accelerator without sparsity optimization.
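The two pruning grains described in the abstract can be pictured with a short sketch: structured filter-level pruning removes whole output filters, and vector-level pruning then zeroes fixed-length weight vectors inside the surviving filters. The function name, the L1-norm saliency, the keep ratios, and the vector length below are illustrative assumptions for a minimal sketch, not the paper's actual dual-factor criterion, which the abstract does not detail.

```python
import numpy as np

def hybrid_grained_prune(weights, filter_keep_ratio=0.75,
                         vector_keep_ratio=0.5, vector_len=8):
    """Minimal sketch of two-stage hybrid-grained pruning (illustrative only).

    weights: conv kernel of shape (out_channels, in_channels, kH, kW).
    Stage 1 (structured, filter-level): zero whole output filters with the
    smallest L1 norms.
    Stage 2 (vector-level): inside each surviving filter, split the flattened
    kernel into fixed-length vectors and zero the weakest vectors.
    """
    out_ch = weights.shape[0]
    pruned = weights.copy()

    # --- Stage 1: filter-level pruning by L1 norm (assumed saliency) ---
    filter_norms = np.abs(pruned).reshape(out_ch, -1).sum(axis=1)
    n_keep = max(1, int(round(filter_keep_ratio * out_ch)))
    dropped = set(np.argsort(filter_norms)[: out_ch - n_keep].tolist())
    for f in dropped:
        pruned[f] = 0.0

    # --- Stage 2: vector-level pruning inside surviving filters ---
    for f in range(out_ch):
        if f in dropped:
            continue
        flat = pruned[f].reshape(-1)
        pad = (-len(flat)) % vector_len              # pad to a whole number of vectors
        padded = np.concatenate([flat, np.zeros(pad)])
        vectors = padded.reshape(-1, vector_len)     # view into `padded`
        v_norms = np.abs(vectors).sum(axis=1)
        v_keep = max(1, int(round(vector_keep_ratio * len(vectors))))
        weak = np.argsort(v_norms)[: len(vectors) - v_keep]
        vectors[weak] = 0.0                          # zero the weakest vectors
        pruned[f] = padded[: len(flat)].reshape(pruned[f].shape)

    return pruned

# Example: prune a random 64x32x3x3 conv layer and report sparsity.
if __name__ == "__main__":
    w = np.random.randn(64, 32, 3, 3)
    w_pruned = hybrid_grained_prune(w)
    print("sparsity:", 1.0 - np.count_nonzero(w_pruned) / w_pruned.size)
```

Fixed-length vectors are what makes the unstructured stage hardware-friendly: the accelerator can skip zeroed vectors in whole blocks rather than tracking individual zero weights.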
Date of Conference: 19-22 May 2024
Date Added to IEEE Xplore: 02 July 2024
Conference Location: Singapore, Singapore

