
Fine-grained Data Distribution Alignment for Post-Training Quantization

  • Conference paper

Computer Vision – ECCV 2022 (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13671)


Abstract

While post-training quantization is popular largely because it avoids access to the complete original training dataset, its performance suffers from the scarcity of calibration images. To alleviate this limitation, in this paper we combine the synthetic data introduced by zero-shot quantization with the calibration dataset and propose a fine-grained data distribution alignment (FDDA) method to boost the performance of post-training quantization. The method builds on two properties of batch normalization statistics (BNS) that we observe in the deep layers of a trained network: inter-class separation and intra-class incohesion. To preserve this fine-grained distribution information: 1) we compute the per-class BNS of the calibration dataset as the BNS center of each class and propose a BNS-centralized loss that forces the synthetic data distributions of different classes to be close to their own centers; 2) we add Gaussian noise to the centers to imitate the incohesion and propose a BNS-distorted loss that forces the synthetic data distribution of the same class to be close to the distorted centers. With these two fine-grained losses, our method achieves state-of-the-art performance on ImageNet, especially when both the first and last layers are quantized to low bit-widths. Code is at https://github.com/zysxmu/FDDA.
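The two losses described in the abstract can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation (see the linked repository for that); the function name, tensor shapes, and the `noise_std` parameter are hypothetical, and real FDDA matches statistics at batch-normalization layers across the network rather than at a single layer.

```python
import torch

def bns_losses(feat_means, feat_vars, center_means, center_vars, noise_std=0.1):
    """Sketch of FDDA's two fine-grained losses (hypothetical shapes/names).

    feat_means, feat_vars: per-class mean/variance statistics of the
        synthetic images at a deep BN layer, shape (num_classes, channels).
    center_means, center_vars: per-class BNS centers of the same shape,
        computed once from the calibration dataset.
    """
    # BNS-centralized loss: pull each class's synthetic statistics toward
    # its own calibration-set center (inter-class separation).
    l_center = ((feat_means - center_means) ** 2).mean() \
             + ((feat_vars - center_vars) ** 2).mean()

    # BNS-distorted loss: perturb the centers with Gaussian noise to
    # imitate intra-class incohesion, then match the distorted centers.
    mu_d = center_means + noise_std * torch.randn_like(center_means)
    var_d = center_vars + noise_std * torch.randn_like(center_vars)
    l_distort = ((feat_means - mu_d) ** 2).mean() \
              + ((feat_vars - var_d) ** 2).mean()

    return l_center, l_distort
```

In use, both terms would be added to the image-synthesis objective so that generated images of each class reproduce that class's calibration statistics without collapsing onto a single point per class.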


Notes

  1. Occasionally, the label \(\hat{y}\) is not available. In this case, it can be predicted by the pre-trained full-precision model.



Acknowledgements

This work was supported by the National Science Fund for Distinguished Young Scholars (No. 62025603), the National Natural Science Foundation of China (No. U21B2037, No. 62176222, No. 62176223, No. 62176226, No. 62072386, No. 62072387, No. 62072389, and No. 62002305), the Guangdong Basic and Applied Basic Research Foundation (No. 2019B1515120049), and the Natural Science Foundation of Fujian Province of China (No. 2021J01002).


Corresponding author

Correspondence to Rongrong Ji .


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 1463 KB)


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zhong, Y. et al. (2022). Fine-grained Data Distribution Alignment for Post-Training Quantization. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13671. Springer, Cham. https://doi.org/10.1007/978-3-031-20083-0_5

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-20083-0_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20082-3

  • Online ISBN: 978-3-031-20083-0

  • eBook Packages: Computer Science; Computer Science (R0)
