Abstract
In this paper, we propose Branch-wise Activation-clipping Search Quantization (BASQ), a novel quantization method for low-bit activations. BASQ optimizes the clip value in a continuous search space while simultaneously searching, in a discrete search space, the L2 decay weight factor used for updating the clip value. We also propose a novel block structure for low precision that works properly on both MobileNet and ResNet structures with branch-wise searching. We evaluate the proposed methods by quantizing both weights and activations to 4 bits or lower. In contrast to existing methods, which are effective only for either redundant networks (e.g., ResNet-18) or highly optimized networks (e.g., MobileNet-v2), our proposed method remains competitive on both types of networks across low precisions from 2 to 4 bits. Specifically, our 2-bit MobileNet-v2 achieves a top-1 accuracy of 64.71% on ImageNet, outperforming the existing method by a large margin (2.8%), and our 4-bit MobileNet-v2 achieves 71.98%, comparable to the full-precision accuracy of 71.88%. In addition, our uniform quantization method makes 2-bit ResNet-18 comparable in accuracy to the state-of-the-art non-uniform quantization method. Source code is available at https://github.com/HanByulKim/BASQ.
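To make the core idea concrete, here is a minimal sketch of clip-based uniform activation quantization, the mechanism whose clip value BASQ searches over (in the spirit of PACT [17]). The function and argument names are illustrative assumptions, not the paper's API, and the discrete search over the L2 decay weight factor is not shown.

```python
def quantize_activation(x, clip_value, n_bits=2):
    """Uniformly quantize activations into [0, clip_value] using 2**n_bits levels.

    `clip_value` plays the role of the learnable/searchable clip parameter:
    activations above it saturate, and the remaining range is divided into
    2**n_bits - 1 uniform steps (quantize-dequantize for simulation).
    """
    levels = (1 << n_bits) - 1          # number of quantization steps
    scale = clip_value / levels         # uniform step size
    out = []
    for v in x:
        v = min(max(v, 0.0), clip_value)       # clip to [0, clip_value]
        out.append(round(v / scale) * scale)   # snap to nearest level
    return out

# With n_bits=2 and clip_value=1.2, the representable levels are
# {0.0, 0.4, 0.8, 1.2}; values outside [0, 1.2] saturate.
q = quantize_activation([-0.5, 0.1, 0.4, 0.9, 2.0], clip_value=1.2, n_bits=2)
```

A larger clip value preserves large activations but coarsens the step size; a smaller one resolves small activations finely but saturates outliers. This trade-off is why the clip value is worth optimizing per branch rather than fixing globally.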
Notes
- 1.
In [13], accuracy is evaluated differently: about 16% of the validation set is used for architecture selection, and the test set is then constructed and evaluated using the entire validation set. To avoid such duplicate use of the same data for both architecture selection and evaluation, we adopt k-fold evaluation.
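The k-fold protocol described above can be sketched as follows: the validation indices are split into k disjoint folds, and each fold is held out for evaluation exactly once while the remaining folds are available for selection, so no sample serves both purposes in the same round. The function name is illustrative, not the paper's code.

```python
def k_fold_splits(indices, k):
    """Split sample indices into k disjoint (selection, evaluation) pairs."""
    fold_size = len(indices) // k
    folds = [indices[i * fold_size:(i + 1) * fold_size] for i in range(k)]
    folds[-1].extend(indices[k * fold_size:])  # remainder goes to last fold
    splits = []
    for i, eval_fold in enumerate(folds):
        # Everything outside the held-out fold may be used for selection.
        selection = [x for j, f in enumerate(folds) if j != i for x in f]
        splits.append((selection, eval_fold))
    return splits

# Example: 10 validation samples, 5 folds of 2 samples each.
splits = k_fold_splits(list(range(10)), k=5)
```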
References
Bai, H., Cao, M., Huang, P., Shan, J.: BatchQuant: quantized-for-all architecture search with robust quantizer. Advances in Neural Information Processing Systems 34 (2021)
Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv:1308.3432 (2013)
Bulat, A., Martinez, B., Tzimiropoulos, G.: BATS: binary architecture search. In: European Conference on Computer Vision (ECCV) (2020)
Cai, H., Zhu, L., Han, S.: ProxylessNAS: direct neural architecture search on target task and hardware. In: International Conference on Learning Representations (ICLR) (2019)
Chen, P., Liu, J., Zhuang, B., Tan, M., Shen, C.: Towards accurate quantized object detection. In: Computer Vision and Pattern Recognition (CVPR) (2021)
Chen, X., Xie, L., Wu, J., Tian, Q.: Progressive differentiable architecture search: bridging the depth gap between search and evaluation. In: International Conference on Computer Vision (ICCV) (2019)
Choi, J., Venkataramani, S., Srinivasan, V., Gopalakrishnan, K., Wang, Z., Chuang, P.: Accurate and efficient 2-bit quantized neural networks. Proc. Mach. Learn. Syst. 1, 348–359 (2019)
Choi, J., Wang, Z., Venkataramani, S., Chuang, P., Srinivasan, V., Gopalakrishnan, K.: PACT: parameterized clipping activation for quantized neural networks. arXiv:1805.06085 (2018)
Chu, X., Zhang, B., Xu, R.: FairNAS: rethinking evaluation fairness of weight sharing neural architecture search. In: International Conference on Computer Vision (ICCV) (2021)
Deng, J., Dong, W., Socher, R., Li, L., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: Computer Vision and Pattern Recognition (CVPR) (2009)
Esser, S., McKinstry, J., Bablani, D., Appuswamy, R., Modha, D.: Learned step size quantization. In: International Conference on Learning Representations (ICLR) (2020)
Gong, R., et al.: Differentiable soft quantization: bridging full-precision and low-bit neural networks. In: International Conference on Computer Vision (ICCV) (2019)
Guo, Z., et al.: Single path one-shot neural architecture search with uniform sampling. In: European Conference on Computer Vision (ECCV) (2020)
Habi, H., Jennings, R., Netzer, A.: HMQ: hardware friendly mixed precision quantization block for CNNs. In: European Conference on Computer Vision (ECCV) (2020)
Han, T., Li, D., Liu, J., Tian, L., Shan, Y.: Improving low-precision network quantization via bin regularization. In: International Conference on Computer Vision (ICCV), pp. 5261–5270 (2021)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Computer Vision and Pattern Recognition (CVPR) (2016)
Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network. arXiv:1503.02531 (2015)
Howard, A., et al.: Searching for MobileNetV3. In: International Conference on Computer Vision (ICCV) (2019)
Howard, A., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv:1704.04861 (2017)
Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Computer Vision and Pattern Recognition (CVPR) (2018)
Jung, S., et al.: Learning to quantize deep networks by optimizing quantization intervals with task loss. In: Computer Vision and Pattern Recognition (CVPR) (2018)
Kim, J., Bhalgat, Y., Lee, J., Patel, C., Kwak, N.: QKD: quantization-aware knowledge distillation. arXiv:1911.12491 (2019)
Kingma, D., Ba, J.: Adam: a method for stochastic optimization. arXiv:1412.6980 (2014)
Krizhevsky, A.: Learning multiple layers of features from tiny images. Technical report (2009)
Li, H., Xu, Z., Taylor, G., Studer, C., Goldstein, T.: Visualizing the loss landscape of neural nets. In: Advances in Neural Information Processing Systems (NIPS) 31 (2018)
Li, Y., Dong, X., Wang, W.: Additive powers-of-two quantization: an efficient non-uniform discretization for neural networks. In: International Conference on Learning Representations (ICLR) (2020)
Liu, H., Simonyan, K., Yang, Y.: DARTS: differentiable architecture search. In: International Conference on Learning Representations (ICLR) (2019)
Liu, Z., Shen, Z., Li, S., Helwegen, K., Huang, D., Cheng, K.: How do Adam and training strategies help BNNs optimization. In: International Conference on Machine Learning (ICML) (2021)
Liu, Z., Shen, Z., Savvides, M., Cheng, K.: ReActNet: towards precise binary neural network with generalized activation functions. In: European Conference on Computer Vision (ECCV) (2020)
Ma, N., Zhang, X., Zheng, H., Sun, J.: ShuffleNet V2: practical guidelines for efficient CNN architecture design. In: European Conference on Computer Vision (ECCV) (2018)
Ma, Y., et al.: OMPQ: orthogonal mixed precision quantization. arXiv:2109.07865 (2021)
Martinez, B., Yang, J., Bulat, A., Tzimiropoulos, G.: Training binary neural networks with real-to-binary convolutions. In: International Conference on Learning Representations (ICLR) (2020)
Park, E., Yoo, S.: Profit: a novel training method for sub-4-bit MobileNet models. In: European Conference on Computer Vision (ECCV) (2020)
Rastegari, M., Ordonez, V., Redmon, J., Farhadi, A.: XNOR-Net: ImageNet classification using binary convolutional neural networks. In: European Conference on Computer Vision (ECCV) (2016)
Real, E., Aggarwal, A., Huang, Y., Le, Q.: Regularized evolution for image classifier architecture search. In: AAAI Conference on Artificial Intelligence (AAAI), pp. 4780–4789 (2019)
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.: MobileNet V2: inverted residuals and linear bottlenecks. In: Computer Vision and Pattern Recognition (CVPR) (2018)
Uhlich, S., et al.: Mixed precision DNNs: all you need is a good parametrization. In: International Conference on Learning Representations (ICLR) (2020)
Wang, K., Liu, Z., Lin, Y., Lin, J., Han, S.: HAQ: hardware-aware automated quantization with mixed precision. In: Computer Vision and Pattern Recognition (CVPR) (2019)
Wang, T., et al.: APQ: joint search for network architecture, pruning and quantization policy. In: Computer Vision and Pattern Recognition (CVPR) (2020)
Wu, B., Wang, Y., Zhang, P., Tian, Y., Vajda, P., Keutzer, K.: Mixed precision quantization of ConvNets via differentiable neural architecture search. arXiv:1812.00090 (2018)
Xie, S., Girshick, R., Dollár, P., Tu, Z., He, K.: Aggregated residual transformations for deep neural networks. In: Computer Vision and Pattern Recognition (CVPR) (2017)
Xie, S., Zheng, H., Liu, C., Lin, L.: SNAS: stochastic neural architecture search. In: International Conference on Learning Representations (ICLR) (2018)
Yamamoto, K.: Learnable companding quantization for accurate low-bit neural networks. In: Computer Vision and Pattern Recognition (CVPR) (2021)
You, S., Huang, T., Yang, M., Wang, F., Qian, C., Zhang, C.: GreedyNAS: towards fast one-shot NAS with greedy supernet. In: Computer Vision and Pattern Recognition (CVPR) (2020)
Yu, H., Li, H., Shi, H., Huang, T., Hua, G.: Any-precision deep neural networks. arXiv:1911.07346 (2019)
Zhang, X., Zhou, X., Lin, M., Sun, J.: ShuffleNet: an extremely efficient convolutional neural network for mobile devices. In: Computer Vision and Pattern Recognition (CVPR) (2018)
Zhang, Y., Pan, J., Liu, X., Chen, H., Chen, D., Zhang, Z.: FracBNN: accurate and FPGA-efficient binary neural networks with fractional activations. In: ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), pp. 171–182 (2021)
Zhou, S., Ni, Z., Zhou, X., Wen, H., Wu, Y., Zou, Y.: DoReFa-Net: training low bitwidth convolutional neural networks with low bitwidth gradients. arXiv:1606.06160 (2016)
Zoph, B., Vasudevan, V., Shlens, J., Le, Q.: Learning transferable architectures for scalable image recognition. In: Computer Vision and Pattern Recognition (CVPR) (2018)
Acknowledgment
This work was supported by IITP and NRF grants funded by the Korea government (MSIT, 2021-0-00105, NRF-2021M3F3A2A02037893) and Samsung Electronics (Memory Division, SAIT, and SRFC-TC1603-04).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Kim, HB., Park, E., Yoo, S. (2022). BASQ: Branch-wise Activation-clipping Search Quantization for Sub-4-bit Neural Networks. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13672. Springer, Cham. https://doi.org/10.1007/978-3-031-19775-8_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19774-1
Online ISBN: 978-3-031-19775-8