Abstract
Filter pruning is one of the main methods of neural network model compression. Existing filter pruning methods either rely heavily on hand-crafted heuristics and expert experience, which is inefficient, or use greedy and heuristic search algorithms that are prone to local optima. Some researchers use reinforcement learning to determine the compression strategy automatically, but these approaches lack guidance from network characteristics and prior knowledge, and their pruning efficiency still needs improvement. To address these issues, we propose a reinforcement learning pruning method based on prior knowledge. First, we rank the filters globally and obtain the position variables α and k and the rank scaling factor r. Then, guided by the defined notion of filter importance, the relevant filter variables are passed into a deep deterministic policy gradient (DDPG) agent as prior knowledge. Finally, an automated reinforcement learning-based pruning procedure iteratively performs filter selection and parameter optimization. We verify the effectiveness of this method through extensive experiments on three mainstream neural network models, VGG, ResNet, and MobileNet, comparing our method with others on the CIFAR-10/100 and ImageNet datasets. On ImageNet, the pruned MobileNetV2 retains only 59.62% of the original FLOPs and 48.4% of the parameters with an accuracy drop of only 0.82%.
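To make the global-ranking step concrete, the sketch below ranks every convolutional filter across layers and zeroes out the lowest-ranked fraction (soft pruning). This is a minimal illustration under stated assumptions, not the paper's method: the L1 norm of a filter's weights stands in for its importance, and the position variables α and k, the rank scaling factor r, and the DDPG agent are not reproduced. The function names (global_filter_ranking, prune_lowest) and the 30% pruning ratio are hypothetical.

```python
# Minimal sketch: global filter ranking by L1 norm, then soft pruning
# (zeroing) of the globally lowest-ranked filters. Assumptions: L1 norm
# as the importance proxy; the paper's alpha, k, r and DDPG agent are
# intentionally omitted.
import torch
import torch.nn as nn

def global_filter_ranking(model: nn.Module):
    """Collect (layer_name, filter_index, importance) for every conv filter."""
    scores = []
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d):
            # L1 norm of each output filter's weights as an importance proxy
            importance = module.weight.detach().abs().sum(dim=(1, 2, 3))
            for idx, score in enumerate(importance.tolist()):
                scores.append((name, idx, score))
    # Global ranking across all layers: smallest importance first
    return sorted(scores, key=lambda item: item[2])

def prune_lowest(model: nn.Module, ratio: float = 0.3):
    """Zero out the globally lowest-ranked fraction of filters (soft pruning)."""
    ranking = global_filter_ranking(model)
    n_prune = int(len(ranking) * ratio)
    modules = dict(model.named_modules())
    with torch.no_grad():
        for name, idx, _ in ranking[:n_prune]:
            modules[name].weight[idx].zero_()
    return model

if __name__ == "__main__":
    # Toy example: two conv layers; prune the 30% least important filters globally.
    model = nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    )
    prune_lowest(model, ratio=0.3)
```

In the full method, the agent would decide a per-layer pruning ratio instead of the fixed 30% used here, and the ranking information would serve as the prior knowledge fed to that agent.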
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant No. 61976098) and the Science and Technology Development Foundation of Quanzhou City (Grant No. 2020C067).
Conflict of interest
All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.
About this article
Cite this article
Zhang, W., Ji, M., Yu, H. et al. ReLP: Reinforcement Learning Pruning Method Based on Prior Knowledge. Neural Process Lett 55, 4661–4678 (2023). https://doi.org/10.1007/s11063-022-11058-3