MXQN: Mixed quantization for reducing bit-width of weights and activations in deep convolutional neural networks

Abstract

Quantization, which reduces bit-width, is considered one of the most effective approaches for rapidly and energy-efficiently deploying deep convolutional neural networks (DCNNs) on resource-constrained embedded hardware. However, reducing the bit-width of the weights and activations of a DCNN seriously degrades its accuracy. To solve this problem, in this paper we propose a mixed hardware-friendly quantization (MXQN) method that applies fixed-point and logarithmic quantization to DCNNs without requiring retraining or fine-tuning. MXQN is a multi-stage process: first, we use the signal-to-quantization-noise ratio (SQNR) as the metric to estimate the interplay between the quantization errors of each layer's parameters and the overall model prediction accuracy. Then, we quantize the weights with fixed-point quantization and, guided by the SQNR metric, empirically select either logarithmic or fixed-point quantization for the activations. For improved accuracy, we propose an optimized logarithmic quantization scheme that affords a fine-grained step size. We evaluate MXQN with the VGG16 network on the MNIST, CIFAR-10, CIFAR-100, and ImageNet datasets, as well as with the VGG19 and ResNet (ResNet18, ResNet34, ResNet50) networks on ImageNet, and demonstrate that the MXQN-quantized DCNN, despite not being retrained or fine-tuned, still achieves accuracy close to that of the original DCNN.
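
To make the quantization schemes named above concrete, the following minimal NumPy sketch shows one way to write symmetric fixed-point quantization, power-of-two logarithmic quantization, and the SQNR metric used to compare them. It is an illustrative assumption rather than the authors' MXQN implementation; every function name, bit-width, and clipping choice here is ours.

import numpy as np

def sqnr_db(x, xq):
    # Signal-to-quantization-noise ratio (in dB) between a tensor and its quantized copy.
    noise = np.sum((x - xq) ** 2)
    return np.inf if noise == 0 else 10.0 * np.log10(np.sum(x ** 2) / noise)

def fixed_point_quantize(x, bits=8):
    # Symmetric uniform (fixed-point) quantization: scale to the integer grid,
    # round, clip to the representable range, then rescale.
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax
    return np.clip(np.round(x / scale), -qmax - 1, qmax) * scale

def log_quantize(x, bits=4):
    # Logarithmic quantization: round each magnitude to the nearest power of two
    # within a bits-wide exponent range; zeros stay zero and signs are preserved.
    sign, mag = np.sign(x), np.abs(x)
    exp_max = np.floor(np.log2(np.max(mag)))
    exp = np.clip(np.round(np.log2(np.where(mag > 0, mag, 1.0))),
                  exp_max - 2 ** bits + 1, exp_max)
    return np.where(mag > 0, sign * 2.0 ** exp, 0.0)

# Toy example: pick the activation quantizer for one layer by comparing SQNR.
acts = np.random.randn(4096).astype(np.float32)
candidates = {"fixed-point": fixed_point_quantize(acts, bits=8),
              "logarithmic": log_quantize(acts, bits=4)}
best = max(candidates, key=lambda name: sqnr_db(acts, candidates[name]))
print({name: round(sqnr_db(acts, q), 2) for name, q in candidates.items()}, "->", best)

In the paper, this kind of per-layer SQNR comparison is what guides the choice between logarithmic and fixed-point quantization of the activations; the optimized fine-grained step size of the authors' logarithmic scheme is not reproduced in this sketch.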

Acknowledgements

The authors would like to thank Jing Wang and Haiyu Mao from Tsinghua University for their comments and beneficial suggestions.

This research was funded by the National Key Research & Development Program of China (Grant No. 2018YFB1003304).

Author information

Corresponding author

Correspondence to Liang Fang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Huang, C., Liu, P. & Fang, L. MXQN: Mixed quantization for reducing bit-width of weights and activations in deep convolutional neural networks. Appl Intell 51, 4561–4574 (2021). https://doi.org/10.1007/s10489-020-02109-0
