Abstract
In this paper, we propose a simple and effective network pruning framework, which introduces novel weight-dependent gates to prune filter adaptively. We argue that the pruning decision should depend on the convolutional weights, in other words, it should be a learnable function of filter weights. We thus construct the weight-dependent gates (W-Gates) to learn the information from filter weights and obtain binary filter gates to prune or keep the filters automatically. To prune the network under hardware constraint, we train a Latency Predict Net (LPNet) to estimate the hardware latency of candidate pruned networks. Based on the proposed LPNet, we can optimize W-Gates and the pruning ratio of each layer under latency constraint. The whole framework is differentiable and can be optimized by gradient-based method to achieve a compact network with better trade-off between accuracy and efficiency. We have demonstrated the effectiveness of our method on Resnet34 and Resnet50, achieving up to 1.33/1.28 higher Top-1 accuracy with lower hardware latency on ImageNet. Compared with state-of-the-art pruning methods, our method achieves superior performance(This work is done when Yun Li, Weiqun Wu and Zechun Liu are interns at Megvii Inc (Face++)).
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Cai, H., Zhu, L., Han, S.: ProxylessNAS: direct neural architecture search on target task and hardware. arXiv preprint arXiv:1812.00332 (2018)
Cao, S., et al.: SeerNet: predicting convolutional neural network feature-map sparsity through low-bit quantization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 11216–11225 (2019)
Chen, W., Wilson, J., Tyree, S., Weinberger, K., Chen, Y.: Compressing neural networks with the hashing trick. In: International Conference on Machine Learning (ICML), pp. 2285–2294 (2015)
Courbariaux, M., Bengio, Y., David, J.P.: BinaryConnect: training deep neural networks with binary weights during propagations. In: Advances in Neural Information Processing Systems (NeurIPS), pp. 3123–3131 (2015)
Courbariaux, M., Hubara, I., Soudry, D., El-Yaniv, R., Bengio, Y.: Binarized neural networks: training deep neural networks with weights and activations constrained to +1 or -1. arXiv preprint arXiv:1602.02830 (2016)
Dai, X., Y., et al.: ChamNet: towards efficient network design through platform-aware model adaptation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 11398–11407 (2019)
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 248–255 (2009)
Ding, X., Ding, G., Guo, Y., Han, J.: Centripetal SGD for pruning very deep convolutional networks with complicated structure. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4943–4953 (2019)
Ding, X., Ding, G., Guo, Y., Han, J., Yan, C.: Approximated oracle filter pruning for destructive CNN width optimization. arXiv preprint arXiv:1905.04748 (2019)
Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 580–587 (2014)
Han, S., Mao, H., Dally, W.J.: Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding. In: Proceedings of International Conference on Learning Representations (ICLR) (2016)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016)
He, Y., Liu, P., Wang, Z., Hu, Z., Yang, Y.: Filter pruning via geometric median for deep convolutional neural networks acceleration. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4340–4349 (2019)
He, Y., Lin, J., Liu, Z., Wang, H., Li, L.-J., Han, S.: AMC: AutoML for model compression and acceleration on mobile devices. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 815–832. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01234-2_48
He, Y., Zhang, X., Sun, J.: Channel pruning for accelerating very deep neural networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2017)
Howard, A.G., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017)
Huang, Z., Wang, N.: Data-driven sparse structure selection for deep neural networks. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11220, pp. 317–334. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01270-0_19
Hubara, I., Courbariaux, M., Soudry, D., El-Yaniv, R., Bengio, Y.: Binarized neural networks. In: Advances in Neural Information Processing Systems (NeurIPS), pp. 4107–4115 (2016)
Jang, E., Gu, S., Poole, B.: Categorical reparameterization with gumbel-softmax. arXiv preprint arXiv:1611.01144 (2016)
Jia, K., Tao, D., Gao, S., Xu, X.: Improving training of deep neural networks via singular value bounding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2017, pp. 3994–4002 (2017)
Li, H., Kadav, A., Durdanovic, I., Samet, H., Graf, H.P.: Pruning filters for efficient convnets. In: Proceedings of International Conference on Learning Representations (ICLR) (2017)
Li, Y., et al.: Exploiting kernel sparsity and entropy for interpretable CNN compression. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2800–2809 (2019)
Li, Y., Wang, L., Peng, S., Kumar, A., Yin, B.: Using feature entropy to guide filter pruning for efficient convolutional networks. In: Tetko, I.V., Kůrková, V., Karpov, P., Theis, F. (eds.) ICANN 2019. LNCS, vol. 11728, pp. 263–274. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30484-3_22
Lin, S., Ji, R., Li, Y., Wu, Y., Huang, F., Zhang, B.: Accelerating convolutional networks via global & dynamic filter pruning. In: Proceedings of the 27th International Joint Conference on Artificial Intelligence (IJCAI), pp. 2425–2432 (2018)
Lin, S., et al.: Towards optimal structured CNN pruning via generative adversarial learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2790–2799 (2019)
Liu, B., Wang, M., Foroosh, H., Tappen, M., Pensky, M.: Sparse convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 806–814 (2015)
Liu, Z., Mu, H., Zhang, X., Guo, Z., Yang, X., Cheng, K.T., Sun, J.: MetaPruning: meta learning for automatic neural network channel pruning. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 3296–3305 (2019)
Liu, Z., Wu, B., Luo, W., Yang, X., Liu, W., Cheng, K.-T.: Bi-real net: enhancing the performance of 1-bit CNNs with improved representational capability and advanced training algorithm. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11219, pp. 747–763. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01267-0_44
Liu, Z., Li, J., Shen, Z., Huang, G., Yan, S., Zhang, C.: Learning efficient convolutional networks through network slimming. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 2736–2744 (2017)
Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3431–3440 (2015)
Luo, J.H., Wu, J.: Autopruner: an end-to-end trainable filter pruning method for efficient deep model inference. Pattern Recognit. (PR) 107461 (2020)
Luo, J.H., Wu, J., Lin, W.: ThiNet: a filter level pruning method for deep neural network compression. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 5058–5066 (2017)
Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: ShuffleNet V2: practical guidelines for efficient CNN architecture design. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) Computer Vision – ECCV 2018. LNCS, vol. 11218, pp. 122–138. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01264-9_8
Maddison, C.J., Mnih, A., Teh, Y.W.: The concrete distribution: a continuous relaxation of discrete random variables. arXiv preprint arXiv:1611.00712 (2016)
Molchanov, P., Mallya, A., Tyree, S., Frosio, I., Kautz, J.: Importance estimation for neural network pruning. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Paszke, A., Gross, S., Chintala, S., Chanan, G.: Pytorch: tensors and dynamic neural networks in python with strong GPU acceleration (2017)
Rastegari, M., Ordonez, V., Redmon, J., Farhadi, A.: XNOR-Net: ImageNet classification using binary convolutional neural networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9908, pp. 525–542. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46493-0_32
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.C.: MobileNetV2: inverted residuals and linear bottlenecks. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Tan, M., et al.: MnasNet: platform-aware neural architecture search for mobile. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Wu, B., et al.: FBNet: hardware-aware efficient convnet design via differentiable neural architecture search. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 10734–10742 (2019)
Xu, Y., et al.: Latency-aware differentiable neural architecture search. arXiv preprint arXiv:2001.06392 (2020)
Yang, H., Zhu, Y., Liu, J.: ECC: platform-independent energy-constrained deep neural network compression via a bilinear regression model. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 11206–11215 (2019)
Yang, T.-J., et al.: NetAdapt: platform-aware neural network adaptation for mobile applications. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11214, pp. 289–304. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01249-6_18
Yu, J., Yang, L., Xu, N., Yang, J., Huang, T.: Slimmable neural networks. In: International Conference on Learning Representations (ICLR) (2019)
Yu, X., Liu, T., Wang, X., Tao, D.: On compressing deep models by low rank and sparse decomposition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7370–7379 (2017)
Zhang, X., Zhou, X., Lin, M., Sun, J.: ShuffleNet: an extremely efficient convolutional neural network for mobile devices. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6848–6856 (2018)
Zhao, C., Ni, B., Zhang, J., Zhao, Q., Zhang, W., Tian, Q.: Variational convolutional neural network pruning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2780–2789 (2019)
Zhou, Y., Zhang, Y., Wang, Y., Tian, Q.: Accelerate CNN via recursive Bayesian pruning. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 3306–3315 (2019)
Acknowledgement
This work is supported the Equipment Pre-Research Foundation of China under grant No. 61403120201.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, Y. et al. (2020). Weight-Dependent Gates for Differentiable Neural Network Pruning. In: Bartoli, A., Fusiello, A. (eds) Computer Vision – ECCV 2020 Workshops. ECCV 2020. Lecture Notes in Computer Science(), vol 12539. Springer, Cham. https://doi.org/10.1007/978-3-030-68238-5_3
Download citation
DOI: https://doi.org/10.1007/978-3-030-68238-5_3
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-68237-8
Online ISBN: 978-3-030-68238-5
eBook Packages: Computer ScienceComputer Science (R0)