Abstract
The performance of a deep neural network (deep NN) depends on a large number of trainable weight parameters, which makes training a computational bottleneck. The trend toward ever-deeper architectures further restricts training and inference on resource-constrained devices. Pruning removes a deep NN's unimportant parameters, easing deployment on such devices for practical applications. In this paper, we propose a novel heuristics-based filter pruning method that automatically identifies and prunes unimportant filters, making inference faster on devices with limited resources. Unimportant filters are selected by a novel pruning estimator (\(\gamma\)). The proposed method is evaluated on several convolutional architectures (AlexNet, VGG16, and ResNet34) and datasets (CIFAR10, CIFAR100, and ImageNet). Experimental results on the large-scale ImageNet dataset show that the FLOPs of VGG16 can be reduced by up to 77.47%, yielding an inference speedup of \(\approx 5\times\). The FLOPs of the widely used ResNet34 are reduced by 41.94% while retaining performance competitive with other state-of-the-art methods.
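The pruning estimator \(\gamma\) that drives filter selection is defined in the full paper, not in this abstract. As a rough illustration of the general idea, the sketch below scores the filters of a PyTorch convolutional layer with a common stand-in criterion, the \(\ell_1\)-norm of their weights, zeroes out the lowest-scoring fraction ("soft" pruning), and estimates the resulting per-layer FLOP reduction. The function names and the 30% pruning ratio are illustrative assumptions, not the paper's actual procedure.

```python
import torch
import torch.nn as nn

def filter_importance(conv: nn.Conv2d) -> torch.Tensor:
    # conv.weight has shape (out_channels, in_channels, kH, kW);
    # score each output filter by the l1-norm of its weights
    # (a stand-in criterion; the paper's estimator gamma is not shown here).
    return conv.weight.detach().abs().sum(dim=(1, 2, 3))

def soft_prune_filters(conv: nn.Conv2d, prune_ratio: float) -> torch.Tensor:
    # Zero out the lowest-scoring fraction of filters ("soft" pruning);
    # return their indices so the matching input channels of the next
    # layer can be removed when the network is physically slimmed.
    scores = filter_importance(conv)
    n_prune = int(prune_ratio * conv.out_channels)
    pruned = torch.argsort(scores)[:n_prune]  # least important first
    with torch.no_grad():
        conv.weight[pruned] = 0.0
        if conv.bias is not None:
            conv.bias[pruned] = 0.0
    return pruned

def conv_flops(conv: nn.Conv2d, out_h: int, out_w: int) -> int:
    # Multiply-accumulate count of one forward pass through the layer.
    kh, kw = conv.kernel_size
    return conv.out_channels * out_h * out_w * conv.in_channels * kh * kw

if __name__ == "__main__":
    layer = nn.Conv2d(64, 128, kernel_size=3, padding=1)
    before = conv_flops(layer, out_h=32, out_w=32)
    pruned = soft_prune_filters(layer, prune_ratio=0.3)
    # Dropping 30% of the filters removes 30% of this layer's FLOPs
    # (and shrinks the input of the following layer as well).
    after = before * (layer.out_channels - pruned.numel()) // layer.out_channels
    print(f"pruned {pruned.numel()}/{layer.out_channels} filters, "
          f"FLOPs {before} -> {after}")
```

In practice, the zeroed filters are physically removed from the architecture and the network is fine-tuned to recover accuracy; such layer-wise FLOP reductions accumulate into model-level savings of the kind reported above.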
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Choudhary, T., Mishra, V., Goswami, A. et al. Heuristic-based automatic pruning of deep neural networks. Neural Comput & Applic 34, 4889–4903 (2022). https://doi.org/10.1007/s00521-021-06679-z