
Fine-grained Data Distribution Alignment for Post-Training Quantization

  • Conference paper

Computer Vision – ECCV 2022 (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13671)


Abstract

While post-training quantization is popular largely because it avoids access to the complete original training dataset, its performance suffers from the scarcity of calibration images. To alleviate this limitation, in this paper we combine the synthetic data introduced by zero-shot quantization with the calibration dataset and propose a fine-grained data distribution alignment (FDDA) method to boost the performance of post-training quantization. The method builds on two properties of batch normalization statistics (BNS) that we observe in the deep layers of a trained network: inter-class separation and intra-class incohesion. To preserve this fine-grained distribution information: 1) we compute the per-class BNS of the calibration dataset as the BNS center of each class and propose a BNS-centralized loss that forces the synthetic data distributions of different classes to be close to their own centers; 2) we add Gaussian noise to the centers to imitate the incohesion and propose a BNS-distorted loss that forces the synthetic data distribution of the same class to be close to the distorted centers. With these two fine-grained losses, our method achieves state-of-the-art performance on ImageNet, especially when both the first and last layers are quantized to low bit-widths. Code is at https://github.com/zysxmu/FDDA.
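The two losses described in the abstract can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation (see the linked repository for that); the function name, tensor shapes, and the `noise_std` parameter are hypothetical, and real FDDA matches statistics at batch-normalization layers across the network rather than at a single layer.

```python
import torch

def bns_losses(feat_means, feat_vars, center_means, center_vars, noise_std=0.1):
    """Sketch of FDDA's two fine-grained losses (hypothetical shapes/names).

    feat_means, feat_vars: per-class mean/variance statistics of the
        synthetic images at a deep BN layer, shape (num_classes, channels).
    center_means, center_vars: per-class BNS centers of the same shape,
        computed once from the calibration dataset.
    """
    # BNS-centralized loss: pull each class's synthetic statistics toward
    # its own calibration-set center (inter-class separation).
    l_center = ((feat_means - center_means) ** 2).mean() \
             + ((feat_vars - center_vars) ** 2).mean()

    # BNS-distorted loss: perturb the centers with Gaussian noise to
    # imitate intra-class incohesion, then match the distorted centers.
    mu_d = center_means + noise_std * torch.randn_like(center_means)
    var_d = center_vars + noise_std * torch.randn_like(center_vars)
    l_distort = ((feat_means - mu_d) ** 2).mean() \
              + ((feat_vars - var_d) ** 2).mean()

    return l_center, l_distort
```

In use, both terms would be added to the image-synthesis objective so that generated images of each class reproduce that class's calibration statistics without collapsing onto a single point per class.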


Notes

  1. Occasionally, the label \(\hat{y}\) is not available. In this case, it can be predicted by the pre-trained full-precision model.



Acknowledgements

This work was supported by the National Science Fund for Distinguished Young Scholars (No. 62025603), the National Natural Science Foundation of China (No. U21B2037, No. 62176222, No. 62176223, No. 62176226, No. 62072386, No. 62072387, No. 62072389, and No. 62002305), the Guangdong Basic and Applied Basic Research Foundation (No. 2019B1515120049), and the Natural Science Foundation of Fujian Province of China (No. 2021J01002).


Corresponding author

Correspondence to Rongrong Ji .


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 1463 KB)


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zhong, Y. et al. (2022). Fine-grained Data Distribution Alignment for Post-Training Quantization. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13671. Springer, Cham. https://doi.org/10.1007/978-3-031-20083-0_5

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-20083-0_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20082-3

  • Online ISBN: 978-3-031-20083-0

  • eBook Packages: Computer Science; Computer Science (R0)
