Deep neural network pruning method based on sensitive layers and reinforcement learning

Published in: Artificial Intelligence Review

Abstract

Compressing neural network models is essential for deploying them on resource-constrained embedded and mobile devices. However, because there is little theoretical guidance for identifying non-salient network components, existing model compression methods are inefficient and labor-intensive. In this paper, we propose a new pruning method for model compression. By examining the rank ordering of the feature maps of convolutional layers, we introduce the concept of sensitive layers and treat layers with more low-rank feature maps as sensitive. We propose a new algorithm for finding sensitive layers and use a deterministic reinforcement learning policy to automate pruning of the insensitive layers. Experimental results show that our method outperforms the state of the art in reducing floating-point operations and parameters while incurring a smaller loss in precision. For example, pruning ResNet-110 on CIFAR-10 reduces floating-point operations by 62.2% while removing 63.9% of the parameters, and on ImageNet our method reduces the floating-point operations of ResNet-50 by 53.8% while deleting 39.9% of the parameters.
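
As a rough illustration of the idea described in the abstract (not the authors' exact algorithm), the sketch below estimates per-layer sensitivity from the ranks of convolutional feature maps on a single batch: a layer whose outputs contain a large fraction of low-rank feature maps is flagged as sensitive, and the remaining (insensitive) layers would then be handed to the reinforcement-learning agent for automated pruning. The rank tolerance rel_tol, the cut-off sensitive_thresh, and the use of torchvision's ResNet-50 are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torchvision.models as models


@torch.no_grad()
def low_rank_fraction(feature_maps: torch.Tensor, rel_tol: float = 0.5) -> float:
    """Fraction of feature maps in a (batch, channels, H, W) tensor whose
    matrix rank falls below rel_tol * min(H, W). Both rel_tol and this
    'fraction of low-rank maps' criterion are illustrative assumptions."""
    b, c, h, w = feature_maps.shape
    ranks = torch.linalg.matrix_rank(feature_maps.reshape(b * c, h, w).float())
    return (ranks < rel_tol * min(h, w)).float().mean().item()


def find_sensitive_layers(model: nn.Module, images: torch.Tensor,
                          sensitive_thresh: float = 0.3):
    """Hook every Conv2d, run one forward pass, and flag layers whose
    fraction of low-rank feature maps exceeds sensitive_thresh."""
    stats, hooks = {}, []

    def make_hook(layer_name):
        def hook(_module, _inputs, output):
            stats[layer_name] = low_rank_fraction(output)
        return hook

    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d):
            hooks.append(module.register_forward_hook(make_hook(name)))

    model.eval()
    with torch.no_grad():
        model(images)
    for h in hooks:
        h.remove()

    # Layers dominated by low-rank feature maps are treated as sensitive;
    # per the abstract, the RL agent then automates pruning only for the
    # remaining (insensitive) layers.
    sensitive = [name for name, frac in stats.items() if frac > sensitive_thresh]
    return sensitive, stats


if __name__ == "__main__":
    net = models.resnet50(weights=None)      # illustrative backbone
    batch = torch.randn(4, 3, 224, 224)      # stand-in for real image data
    sensitive_layers, per_layer_stats = find_sensitive_layers(net, batch)
    print(sensitive_layers)
```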

Data availability

The datasets generated and/or analyzed during the current study are available in the CIFAR-10 and ImageNet repositories [http://www.cs.toronto.edu/~kriz/cifar.html, https://image-net.org/].


Funding

This work was supported by the National Natural Science Foundation of China (No. 61936008).

Author information

Authors and Affiliations

Authors

Contributions

Wenchuan Yang, Haoran Yu, and Baojiang Cui wrote the main manuscript text; Runqi Sui and Tianyu Gu prepared all the experiments. All authors reviewed the manuscript.

Corresponding author

Correspondence to Haoran Yu.

Ethics declarations

Conflict of interest

All authors certify that they have no affiliations or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yang, W., Yu, H., Cui, B. et al. Deep neural network pruning method based on sensitive layers and reinforcement learning. Artif Intell Rev 56 (Suppl 2), 1897–1917 (2023). https://doi.org/10.1007/s10462-023-10566-5
