
Root quantization: a self-adaptive supplement STE


Abstract

Low-precision quantization of deep neural network models can unlock stronger deployment properties such as shorter inference time and lower energy and memory consumption, but it also induces performance degradation and instability during training. The Straight-Through Estimator (STE) is widely used in Quantization-Aware Training (QAT) to overcome these shortcomings and achieves good results for 2-, 3-, and 4-bit quantization. However, different STE functions may perform differently under different quantization precision settings. To explore the applicable bit-width range of STE functions and to stabilize the training process, we propose Root Quantization. Root Quantization combines two estimators: a linear estimator and a root estimator. The linear estimator follows existing methods that train the quantizer and the weights under the task loss, while the root estimator, based on a high-degree root function, acts as a correction module that fine-tunes the weights; it not only approximates the gradient of the quantization error but also makes that gradient more accurate. The root estimator can further adapt each layer's root degree to its most suitable value through the task loss gradient. Extensive experiments on CIFAR-10 and ImageNet, with different network architectures across various bit widths, demonstrate the effectiveness of our method.
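To make the idea concrete, below is a minimal PyTorch sketch of the general concept only: a plain straight-through estimator for uniform weight quantization placed next to a hypothetical root-shaped backward pass whose degree p could be tuned (or learned) per layer. The class names (UniformQuantSTE, RootShapedSTE), the specific gradient shaping |w|^(1/p - 1), and the parameter p are illustrative assumptions and are not the paper's exact formulation.

import torch

class UniformQuantSTE(torch.autograd.Function):
    """b-bit uniform weight quantizer with a vanilla straight-through backward pass."""

    @staticmethod
    def forward(ctx, w, bits=4):
        qmax = 2 ** (bits - 1) - 1
        scale = w.abs().max().clamp(min=1e-8) / qmax
        return torch.round(w / scale).clamp(-qmax - 1, qmax) * scale

    @staticmethod
    def backward(ctx, grad_out):
        # Vanilla STE: treat round() as the identity and pass the gradient through.
        return grad_out, None

class RootShapedSTE(torch.autograd.Function):
    """Illustrative root-shaped estimator (an assumption, not the paper's method):
    the backward pass is scaled by the derivative of sign(w) * |w|**(1/p), so the
    correction depends on |w| and on a tunable root degree p, which one could make
    learnable per layer."""

    @staticmethod
    def forward(ctx, w, bits=4, p=3.0):
        qmax = 2 ** (bits - 1) - 1
        scale = w.abs().max().clamp(min=1e-8) / qmax
        ctx.save_for_backward(w)
        ctx.p = p
        return torch.round(w / scale).clamp(-qmax - 1, qmax) * scale

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        p = ctx.p
        # d/dw |w|**(1/p) = (1/p) * |w|**(1/p - 1); clamp |w| for stability near zero.
        shaping = w.abs().clamp(min=1e-3).pow(1.0 / p - 1.0) / p
        return grad_out * shaping, None, None

if __name__ == "__main__":
    w = torch.randn(8, requires_grad=True)
    q = RootShapedSTE.apply(w, 4, 3.0)   # quantize to 4 bits with root degree p = 3
    q.sum().backward()
    print(w.grad)                        # gradient shaped by |w|**(1/p - 1) / p

Used this way, the root-shaped term acts as a correction on top of an ordinary STE-trained quantizer; in the paper the degree itself is adapted per layer through the task loss gradient, which in a sketch like this would amount to making p a learnable parameter.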



Author information


Corresponding author

Correspondence to Hong Zhou.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Zhang, L., He, Y., Lou, Z. et al. Root quantization: a self-adaptive supplement STE. Appl Intell 53, 6266–6275 (2023). https://doi.org/10.1007/s10489-022-03691-1

