
Fast data-free model compression via dictionary-pair reconstruction

  • Regular Paper
  • Published in Knowledge and Information Systems

Abstract

Deep neural networks (DNNs) have achieved satisfactory results on various vision tasks; however, their large models and massive parameters complicate deployment. DNN compression can effectively reduce the memory footprint of a deep model so that it can be deployed on portable devices. However, most existing model compression methods, e.g., vector quantization or pruning, are time-consuming, which makes them ill-suited to applications that require fast computation. In this paper, we therefore explore how to accelerate the model compression process by reducing its computation cost. We propose a new model compression method, termed dictionary-pair-based fast data-free DNN compression, which reduces the memory consumption of DNNs without extra training and greatly improves compression efficiency. Specifically, our method performs tensor decomposition of a DNN model with a fast dictionary-pair learning-based reconstruction approach, which can be applied to different weight layers (e.g., convolutional and fully connected layers). Given a pre-trained DNN model, we first divide the parameters (i.e., weights) of each layer into a series of partitions for dictionary-pair-driven fast reconstruction, which can potentially discover more fine-grained information and enables parallel model compression. Then, dictionaries with a smaller memory occupation are learned to reconstruct the weights. Moreover, an automatic hyper-parameter tuning scheme and a shared-dictionary mechanism are proposed to improve the model performance and usability. Extensive experiments on popular DNN models (i.e., VGG-16, ResNet-18 and ResNet-50) show that our weight compression method can significantly reduce the memory footprint and speed up the compression process with little performance loss.
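To make the weight-partitioning and dictionary-pair reconstruction idea concrete, the sketch below approximates each partition of a layer's weight matrix as the product of a small synthesis dictionary and a coding matrix, so that only the two small factors need to be stored. This is only an illustrative reading of the abstract, not the paper's algorithm: the partition scheme, the dictionary width k, and the alternating least-squares solver used here are assumptions introduced for illustration.

# Minimal sketch of dictionary-pair-style weight reconstruction for compression.
# The partitioning, dictionary width `k`, and the alternating least-squares
# solver are illustrative assumptions, not the paper's exact method.
import numpy as np

def learn_dictionary_pair(W, k, n_iter=20, seed=0):
    """Approximate a weight block W (d x n) as D @ A with D (d x k), A (k x n).

    Storing D and A instead of W saves memory whenever k * (d + n) < d * n.
    """
    rng = np.random.default_rng(seed)
    d, n = W.shape
    D = rng.standard_normal((d, k)) * 0.01
    for _ in range(n_iter):
        # Fix D, solve for the coding matrix A in the least-squares sense.
        A, *_ = np.linalg.lstsq(D, W, rcond=None)
        # Fix A, solve for the synthesis dictionary D in the least-squares sense.
        D = np.linalg.lstsq(A.T, W.T, rcond=None)[0].T
    return D, A

def compress_layer(W, num_partitions=4, k=32):
    """Split a flattened layer weight into column partitions and
    reconstruct each one from its own small dictionary pair."""
    blocks = np.array_split(W, num_partitions, axis=1)
    pairs = [learn_dictionary_pair(B, k) for B in blocks]
    W_hat = np.concatenate([D @ A for D, A in pairs], axis=1)
    return pairs, W_hat

if __name__ == "__main__":
    # Example: a 256 x 512 fully connected weight, compressed data-free
    # (only the pretrained weights are used, no training samples).
    W = np.random.randn(256, 512).astype(np.float32)
    pairs, W_hat = compress_layer(W, num_partitions=4, k=32)
    err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
    stored = sum(D.size + A.size for D, A in pairs)
    print(f"relative reconstruction error: {err:.3f}, "
          f"stored params: {stored} vs original: {W.size}")

With these illustrative sizes, each 256 x 128 block is replaced by factors holding 32 x (256 + 128) entries, roughly a 2.7x reduction; real pretrained weights are far more structured than the random matrix used in this demo, so the reconstruction error reported there overstates what one would see in practice.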



Acknowledgements

The work described in this paper is partially supported by the National Natural Science Foundation of China (62072151, 62020106007) and Anhui Provincial Natural Science Fund for Distinguished Young Scholars (2008085J30). Zhao Zhang is the corresponding author of this paper, and Mingbo Zhao is the co-corresponding author of this paper.

Author information

Correspondence to Zhao Zhang or Mingbo Zhao.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Gao, Y., Zhang, Z., Zhang, H. et al. Fast data-free model compression via dictionary-pair reconstruction. Knowl Inf Syst 65, 3435–3461 (2023). https://doi.org/10.1007/s10115-023-01846-1

