Abstract
While deep neural networks (DNNs) have proven to be efficient for numerous tasks, they come at a high memory and computation cost, making them impractical on resource-limited devices. These networks are, however, known to contain a large number of parameters, and recent research has shown that their structure can be made more compact without compromising their performance.
In this paper, we present a sparsity-inducing regularization term based on the ratio \(l_1/l_2\) pseudo-norm defined on the filter coefficients. By defining this pseudo-norm appropriately over the different filter kernels and removing irrelevant filters, the number of kernels in each layer can be drastically reduced, leading to very compact Deep Convolutional Neural Network (DCNN) structures. Unlike many existing methods, our approach does not require an iterative retraining process: the regularization term directly produces a sparse model during training. It is also considerably simpler to implement. Experimental results on MNIST and CIFAR-10 show that our approach significantly reduces the number of filters of classical models such as LeNet and VGG while reaching the same or even better accuracy than the baseline models. Moreover, the trade-off between sparsity and accuracy is compared to other loss regularization terms based on the \(l_1\) or \(l_2\) norm, as well as to the SSL [1], NISP [2] and GAL [3] methods, showing that our approach outperforms them.
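As a rough illustration (not the authors' implementation), the regularization described above can be sketched in PyTorch: for each convolutional layer, the \(l_1/l_2\) ratio is computed over the flattened coefficients of each filter, and the sum of these ratios is added to the training loss, weighted by a coefficient. The function names, the weighting coefficient `lam` and the training-step structure below are assumptions for illustration only.

```python
import torch
import torch.nn as nn

def l1_over_l2_penalty(model: nn.Module, eps: float = 1e-8) -> torch.Tensor:
    """Sum of l1/l2 ratios, one ratio per convolutional filter.

    For a Conv2d weight of shape (out_channels, in_channels, kH, kW),
    each of the out_channels filters is flattened and contributes
    ||w||_1 / ||w||_2 to the penalty. A small eps avoids division by
    zero once a filter has been driven to (near) zero.
    """
    penalty = torch.zeros((), device=next(model.parameters()).device)
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            w = module.weight.flatten(start_dim=1)   # (out_channels, in*kH*kW)
            l1 = w.abs().sum(dim=1)                  # per-filter l1 norm
            l2 = w.pow(2).sum(dim=1).sqrt()          # per-filter l2 norm
            penalty = penalty + (l1 / (l2 + eps)).sum()
    return penalty

# Hypothetical training step: the task loss is augmented with the sparsity
# term, weighted by a regularization coefficient lam (an assumed name).
def training_step(model, x, y, lam=1e-3):
    criterion = nn.CrossEntropyLoss()
    return criterion(model(x), y) + lam * l1_over_l2_penalty(model)
```

After training with such a term, filters whose coefficients have been driven to (near) zero can be removed, yielding the compact structures discussed above.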
This work has been sponsored by the Auvergne Regional Council and the European Regional Development Fund (FEDER).
References
Wen, W., Wu, C., Wang, Y., Chen, Y., Li, H.: Learning structured sparsity in deep neural networks. arXiv preprint arXiv:1608.03665 (2016)
Yu, R., et al.: NISP: pruning networks using neuron importance score propagation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 9194–9203 (2018)
Lin, S., et al.: Towards optimal structured CNN pruning via generative adversarial learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2790–2799 (2019)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25, 1097–1105 (2012)
LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2015)
Szegedy, C., et al.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9 (2015)
He, K., Sun, J.: Convolutional neural networks at constrained time cost. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5353–5360 (2015)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Li, H., Kadav, A., Durdanovic, I., Samet, H., Graf, H.P.: Pruning filters for efficient ConvNets. arXiv preprint arXiv:1608.08710 (2017)
Luo, J.H., Wu, J., Lin, W.: ThiNet: a filter level pruning method for deep neural network compression. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 5058–5066 (2017)
Jaderberg, M., Vedaldi, A., Zisserman, A.: Speeding up convolutional neural networks with low rank expansions. arXiv preprint arXiv:1405.3866 (2014)
Bach, F., Jenatton, R., Mairal, J., Obozinski, G.: Optimization with sparsity-inducing penalties. arXiv preprint arXiv:1108.0775 (2012)
Huang, J., Zhang, T.: The benefit of group sparsity. Ann. Statist. 38(4), 1978–2004 (2010)
Turlach, B., Venables, W., Wright, S.: Simultaneous variable selection. Technometrics 47(3), 349–363 (2005)
Yuan, M., Lin, Y.: Model selection and estimation in regression with grouped variables. J. Roy. Stat. Soc. B (Stat. Methodol.) 68(1), 49–67 (2006)
Dauphin, Y.N., Bengio, Y.: Big neural networks waste capacity. arXiv preprint arXiv:1301.3583 (2013)
Ba, L.J., Caruana, R.: Do deep nets really need to be deep? arXiv preprint arXiv:1312.6184 (2014)
Gupta, S., Agrawal, A., Gopalakrishnan, K., Narayanan, P.: Deep learning with limited numerical precision. In: International Conference on Machine Learning, pp. 1737–1746 (2015)
Courbariaux, M., Bengio, Y., David, J.P.: Training deep neural networks with low precision multiplications. arXiv preprint arXiv:1412.7024 (2014)
Williamson, D.: Dynamically scaled fixed point arithmetic. In: IEEE Pacific Rim Conference on Communications, Computers and Signal Processing Conference Proceedings, pp. 315–318 (1991)
Iandola, F.N., Han, S., Moskewicz, M.W., Ashraf, K., Dally, W.J., Keutzer, K.: SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv preprint arXiv:1602.07360 (2016)
Howard, A.G., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017)
Miikkulainen, R., et al.: Evolving deep neural networks. In: Artificial Intelligence in the Age of Neural Networks and Brain Computing, pp. 293–312 (2017)
Tan, M., Chen, B., Pang, R., Vasudevan, V., Le, Q.V.: MnasNet: platform-aware neural architecture search for mobile. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2820–2828 (2019)
He, Y., Lin, J., Liu, Z., Wang, H., Li, L.J., Han, S.: AMC: autoML for model compression and acceleration on mobile devices. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 784–800 (2018)
Han, S., Mao, H., Dally, W.J.: Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding. arXiv preprint arXiv:1510.00149 (2016)
Choi, Y., El-Khamy, M., Lee, J.: Towards the limit of network quantization. arXiv preprint arXiv:1612.01543 (2017)
Han, S., Pool, J., Tran, J., Dally, W.J.: Learning both weights and connections for efficient neural network. arXiv preprint arXiv:1506.02626 (2015)
Anwar, S., Hwang, K., Sung, W.: Structured pruning of deep convolutional neural networks. ACM J. Emerg. Technol. Comput. Syst. 13(3), 1–18 (2017)
Molchanov, P., Tyree, S., Karras, T., Aila, T., Kautz, J.: Pruning convolutional neural networks for resource efficient transfer learning. arXiv preprint arXiv:1611.06440 (2017)
He, Y., Zhang, X., Sun, J.: Channel pruning for accelerating very deep neural networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1389–1397 (2017)
Liu, Z., Li, J., Shen, Z., Huang, G., Yan, S., Zhang, C.: Learning efficient convolutional networks through network slimming. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2736–2744 (2017)
Zhuang, Z., et al.: Discrimination-aware channel pruning for deep neural networks. arXiv preprint arXiv:1810.11809 (2018)
He, Y., Liu, P., Wang, Z., Hu, Z., Yang, Y.: Filter pruning via geometric median for deep convolutional neural networks acceleration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4340–4349 (2019)
Liu, B., Wang, M., Foroosh, H., Tappen, M., Pensky, M.: Sparse convolutional neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 806–814 (2015)
Liu, J., Ye, J.: Efficient l1/lq norm regularization. arXiv preprint arXiv:1009.4766 (2010)
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)