
BASQ: Branch-wise Activation-clipping Search Quantization for Sub-4-bit Neural Networks

  • Conference paper
Computer Vision – ECCV 2022 (ECCV 2022)

Abstract

In this paper, we propose Branch-wise Activation-clipping Search Quantization (BASQ), a novel quantization method for low-bit activations. BASQ optimizes the clip value in a continuous search space while simultaneously searching the L2 decay weight factor used to update the clip value in a discrete search space. We also propose a novel block structure for low precision that works properly on both MobileNet and ResNet structures with branch-wise searching. We evaluate the proposed methods by quantizing both weights and activations to 4 bits or lower. Contrary to existing methods, which are effective only for redundant networks (e.g., ResNet-18) or highly optimized networks (e.g., MobileNet-v2), our proposed method remains competitive on both types of networks across low precisions from 2 to 4 bits. Specifically, our 2-bit MobileNet-v2 achieves a top-1 accuracy of 64.71% on ImageNet, outperforming the existing method by a large margin (2.8%), and our 4-bit MobileNet-v2 reaches 71.98%, comparable to the full-precision accuracy of 71.88%. Moreover, our uniform quantization method makes 2-bit ResNet-18 comparable in accuracy to the state-of-the-art non-uniform quantization method. Source code is available at https://github.com/HanByulKim/BASQ.
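As background for the clip-value search described above, the sketch below shows clipped uniform activation quantization in the style of PACT [8], where a learnable clip value `alpha` bounds the activation range before uniform rounding. The function and parameter names are illustrative assumptions, not the paper's implementation; BASQ additionally searches `alpha` and its L2 decay weight branch-wise.

```python
import numpy as np

def quantize_activation(x, alpha, n_bits):
    """Clipped uniform quantization of non-negative activations.

    x:      activation tensor
    alpha:  clip value (learnable in PACT-style methods)
    n_bits: target bit width (e.g., 2 to 4 for sub-4-bit networks)
    """
    levels = 2 ** n_bits - 1                 # number of quantization steps
    x_clipped = np.clip(x, 0.0, alpha)       # restrict range to [0, alpha]
    scale = alpha / levels                   # uniform step size
    return np.round(x_clipped / scale) * scale
```

During training, methods of this family propagate gradients through the rounding step with a straight-through estimator [2] and update `alpha` by backpropagation; the discrete choice BASQ searches is the L2 decay weight applied to that update.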


Notes

  1. In [13], the accuracy is evaluated differently: about 16% of the validation set is used for architecture selection, and the test set is then constructed from, and evaluated on, the entire validation set. To avoid such duplicate use of the same data for both architecture selection and evaluation, we adopt k-fold evaluation.
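To make the k-fold protocol concrete, the following sketch (a hypothetical helper, not the authors' code) partitions validation-set indices into k disjoint folds so that, within each split, the images used for architecture selection never overlap with those used for the final evaluation:

```python
import numpy as np

def kfold_splits(n_samples, k):
    """Partition sample indices into k disjoint folds.

    Returns k (selection_idx, eval_idx) pairs; each fold is held out
    once for evaluation while the remaining folds drive selection.
    """
    folds = np.array_split(np.arange(n_samples), k)
    splits = []
    for i in range(k):
        eval_idx = folds[i]
        sel_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        splits.append((sel_idx, eval_idx))
    return splits
```

Averaging accuracy over the k held-out folds then gives an estimate that never reuses a selection image for evaluation.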

References

  1. Bai, H., Cao, M., Huang, P., Shan, J.: BatchQuant: quantized-for-all architecture search with robust quantizer. In: Advances in Neural Information Processing Systems (NeurIPS), vol. 34 (2021)

  2. Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013)

  3. Bulat, A., Martinez, B., Tzimiropoulos, G.: BATS: binary architecture search. In: European Conference on Computer Vision (ECCV) (2020)

  4. Cai, H., Zhu, L., Han, S.: ProxylessNAS: direct neural architecture search on target task and hardware. In: International Conference on Learning Representations (ICLR) (2019)

  5. Chen, P., Liu, J., Zhuang, B., Tan, M., Shen, C.: Towards accurate quantized object detection. In: Computer Vision and Pattern Recognition (CVPR) (2021)

  6. Chen, X., Xie, L., Wu, J., Tian, Q.: Progressive differentiable architecture search: bridging the depth gap between search and evaluation. In: International Conference on Computer Vision (ICCV) (2019)

  7. Choi, J., Venkataramani, S., Srinivasan, V., Gopalakrishnan, K., Wang, Z., Chuang, P.: Accurate and efficient 2-bit quantized neural networks. In: Proceedings of Machine Learning and Systems, vol. 1, pp. 348–359 (2019)

  8. Choi, J., Wang, Z., Venkataramani, S., Chuang, P., Srinivasan, V., Gopalakrishnan, K.: PACT: parameterized clipping activation for quantized neural networks. arXiv preprint arXiv:1805.06085 (2018)

  9. Chu, X., Zhang, B., Xu, R.: FairNAS: rethinking evaluation fairness of weight sharing neural architecture search. In: International Conference on Computer Vision (ICCV) (2021)

  10. Deng, J., Dong, W., Socher, R., Li, L., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: Computer Vision and Pattern Recognition (CVPR) (2009)

  11. Esser, S., McKinstry, J., Bablani, D., Appuswamy, R., Modha, D.: Learned step size quantization. In: International Conference on Learning Representations (ICLR) (2020)

  12. Gong, R., et al.: Differentiable soft quantization: bridging full-precision and low-bit neural networks. In: International Conference on Computer Vision (ICCV) (2019)

  13. Guo, Z., et al.: Single path one-shot neural architecture search with uniform sampling. In: European Conference on Computer Vision (ECCV) (2020)

  14. Habi, H., Jennings, R., Netzer, A.: HMQ: hardware friendly mixed precision quantization block for CNNs. In: European Conference on Computer Vision (ECCV) (2020)

  15. Han, T., Li, D., Liu, J., Tian, L., Shan, Y.: Improving low-precision network quantization via bin regularization. In: International Conference on Computer Vision (ICCV), pp. 5261–5270 (2021)

  16. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Computer Vision and Pattern Recognition (CVPR) (2016)

  17. Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015)

  18. Howard, A., et al.: Searching for MobileNetV3. In: International Conference on Computer Vision (ICCV) (2019)

  19. Howard, A., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017)

  20. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Computer Vision and Pattern Recognition (CVPR) (2018)

  21. Jung, S., et al.: Learning to quantize deep networks by optimizing quantization intervals with task loss. In: Computer Vision and Pattern Recognition (CVPR) (2018)

  22. Kim, J., Bhalgat, Y., Lee, J., Patel, C., Kwak, N.: QKD: quantization-aware knowledge distillation. arXiv preprint arXiv:1911.12491 (2019)

  23. Kingma, D., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)

  24. Krizhevsky, A.: Learning multiple layers of features from tiny images. Technical report (2009)

  25. Li, H., Xu, Z., Taylor, G., Studer, C., Goldstein, T.: Visualizing the loss landscape of neural nets. In: Advances in Neural Information Processing Systems (NeurIPS), vol. 31 (2018)

  26. Li, Y., Dong, X., Wang, W.: Additive powers-of-two quantization: an efficient non-uniform discretization for neural networks. In: International Conference on Learning Representations (ICLR) (2020)

  27. Liu, H., Simonyan, K., Yang, Y.: DARTS: differentiable architecture search. In: International Conference on Learning Representations (ICLR) (2019)

  28. Liu, Z., Shen, Z., Li, S., Helwegen, K., Huang, D., Cheng, K.: How do Adam and training strategies help BNNs optimization. In: International Conference on Machine Learning (ICML) (2021)

  29. Liu, Z., Shen, Z., Savvides, M., Cheng, K.: ReActNet: towards precise binary neural network with generalized activation functions. In: European Conference on Computer Vision (ECCV) (2020)

  30. Ma, N., Zhang, X., Zheng, H., Sun, J.: ShuffleNet V2: practical guidelines for efficient CNN architecture design. In: European Conference on Computer Vision (ECCV) (2018)

  31. Ma, Y., et al.: OMPQ: orthogonal mixed precision quantization. arXiv preprint arXiv:2109.07865 (2021)

  32. Martinez, B., Yang, J., Bulat, A., Tzimiropoulos, G.: Training binary neural networks with real-to-binary convolutions. In: International Conference on Learning Representations (ICLR) (2020)

  33. Park, E., Yoo, S.: PROFIT: a novel training method for sub-4-bit MobileNet models. In: European Conference on Computer Vision (ECCV) (2020)

  34. Rastegari, M., Ordonez, V., Redmon, J., Farhadi, A.: XNOR-Net: ImageNet classification using binary convolutional neural networks. In: European Conference on Computer Vision (ECCV) (2016)

  35. Real, E., Aggarwal, A., Huang, Y., Le, Q.: Regularized evolution for image classifier architecture search. In: AAAI Conference on Artificial Intelligence, vol. 33, pp. 4780–4789 (2019)

  36. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.: MobileNetV2: inverted residuals and linear bottlenecks. In: Computer Vision and Pattern Recognition (CVPR) (2018)

  37. Uhlich, S., et al.: Mixed precision DNNs: all you need is a good parametrization. In: International Conference on Learning Representations (ICLR) (2020)

  38. Wang, K., Liu, Z., Lin, Y., Lin, J., Han, S.: HAQ: hardware-aware automated quantization with mixed precision. In: Computer Vision and Pattern Recognition (CVPR) (2019)

  39. Wang, T., et al.: APQ: joint search for network architecture, pruning and quantization policy. In: Computer Vision and Pattern Recognition (CVPR) (2020)

  40. Wu, B., Wang, Y., Zhang, P., Tian, Y., Vajda, P., Keutzer, K.: Mixed precision quantization of ConvNets via differentiable neural architecture search. arXiv preprint arXiv:1812.00090 (2018)

  41. Xie, S., Girshick, R., Dollár, P., Tu, Z., He, K.: Aggregated residual transformations for deep neural networks. In: Computer Vision and Pattern Recognition (CVPR) (2017)

  42. Xie, S., Zheng, H., Liu, C., Lin, L.: SNAS: stochastic neural architecture search. In: International Conference on Learning Representations (ICLR) (2018)

  43. Yamamoto, K.: Learnable companding quantization for accurate low-bit neural networks. In: Computer Vision and Pattern Recognition (CVPR) (2021)

  44. You, S., Huang, T., Yang, M., Wang, F., Qian, C., Zhang, C.: GreedyNAS: towards fast one-shot NAS with greedy supernet. In: Computer Vision and Pattern Recognition (CVPR) (2020)

  45. Yu, H., Li, H., Shi, H., Huang, T., Hua, G.: Any-precision deep neural networks. arXiv preprint arXiv:1911.07346 (2019)

  46. Zhang, X., Zhou, X., Lin, M., Sun, J.: ShuffleNet: an extremely efficient convolutional neural network for mobile devices. In: Computer Vision and Pattern Recognition (CVPR) (2018)

  47. Zhang, Y., Pan, J., Liu, X., Chen, H., Chen, D., Zhang, Z.: FracBNN: accurate and FPGA-efficient binary neural networks with fractional activations. In: ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, pp. 171–182 (2021)

  48. Zhou, S., Ni, Z., Zhou, X., Wen, H., Wu, Y., Zou, Y.: DoReFa-Net: training low bitwidth convolutional neural networks with low bitwidth gradients. arXiv preprint arXiv:1606.06160 (2016)

  49. Zoph, B., Vasudevan, V., Shlens, J., Le, Q.: Learning transferable architectures for scalable image recognition. In: Computer Vision and Pattern Recognition (CVPR) (2018)

Acknowledgment

This work was supported by IITP and NRF grants funded by the Korea government (MSIT, 2021-0-00105, NRF-2021M3F3A2A02037893) and Samsung Electronics (Memory Division, SAIT, and SRFC-TC1603-04).

Author information

Corresponding author

Correspondence to Sungjoo Yoo.

Electronic supplementary material

Supplementary material 1 (pdf 575 KB)

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Kim, HB., Park, E., Yoo, S. (2022). BASQ: Branch-wise Activation-clipping Search Quantization for Sub-4-bit Neural Networks. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13672. Springer, Cham. https://doi.org/10.1007/978-3-031-19775-8_2

  • DOI: https://doi.org/10.1007/978-3-031-19775-8_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19774-1

  • Online ISBN: 978-3-031-19775-8

  • eBook Packages: Computer Science (R0)
