Abstract
While post-training quantization owes its popularity largely to not requiring access to the original, complete training dataset, its poor performance also stems from the scarcity of images. To alleviate this limitation, in this paper we leverage the synthetic data introduced by zero-shot quantization together with the calibration dataset, and propose a fine-grained data distribution alignment (FDDA) method to boost the performance of post-training quantization. The method builds on two important properties of batch normalization statistics (BNS) that we observe in the deep layers of a trained network: inter-class separation and intra-class incohesion. To preserve this fine-grained distribution information: 1) We compute the per-class BNS of the calibration dataset as the BNS center of each class and propose a BNS-centralized loss that forces the synthetic data distributions of different classes to stay close to their own centers. 2) We add Gaussian noise to the centers to imitate the incohesion and propose a BNS-distorted loss that forces the synthetic data distribution of the same class to stay close to the distorted centers. With these two fine-grained losses, our method achieves state-of-the-art performance on ImageNet, especially when both the first and last layers are quantized to low bit-widths. Code is at https://github.com/zysxmu/FDDA.
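As a minimal sketch of the two losses described above (not the authors' implementation; function names, the squared-error matching of means/variances, and the noise scale `sigma` are illustrative assumptions), the per-class BNS centers and the centralized/distorted losses can be written as:

```python
import numpy as np

def per_class_bns_centers(features, labels, num_classes):
    """Per-class mean and variance of features: the BNS center of each class."""
    centers = {}
    for c in range(num_classes):
        f = features[labels == c]
        centers[c] = (f.mean(axis=0), f.var(axis=0))
    return centers

def bns_centralized_loss(batch_feats, batch_labels, centers):
    """Pull the statistics of each class's synthetic data toward its own center."""
    loss = 0.0
    for c in np.unique(batch_labels):
        f = batch_feats[batch_labels == c]
        mu_c, var_c = centers[c]
        loss += np.sum((f.mean(axis=0) - mu_c) ** 2)
        loss += np.sum((f.var(axis=0) - var_c) ** 2)
    return loss

def bns_distorted_loss(batch_feats, batch_labels, centers, sigma=0.1, rng=None):
    """Match statistics to Gaussian-distorted centers to imitate intra-class incohesion."""
    rng = rng if rng is not None else np.random.default_rng(0)
    loss = 0.0
    for c in np.unique(batch_labels):
        f = batch_feats[batch_labels == c]
        mu_c, var_c = centers[c]
        mu_d = mu_c + rng.normal(0.0, sigma, size=mu_c.shape)
        var_d = var_c + rng.normal(0.0, sigma, size=var_c.shape)
        loss += np.sum((f.mean(axis=0) - mu_d) ** 2)
        loss += np.sum((f.var(axis=0) - var_d) ** 2)
    return loss
```

In the paper these statistics are taken from the batch-normalization layers of the full-precision network rather than from raw features; the sketch only shows the shape of the two objectives: the centralized loss is zero when a class's synthetic statistics coincide with its center, while the distorted loss keeps them near, but not collapsed onto, that center.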
Notes
1. Occasionally, the label \(\hat{y}\) is not available; in this case, it can be predicted by the pre-trained full-precision model.
Acknowledgements
This work was supported by the National Science Fund for Distinguished Young Scholars (No. 62025603), the National Natural Science Foundation of China (No. U21B2037, No. 62176222, No. 62176223, No. 62176226, No. 62072386, No. 62072387, No. 62072389, and No. 62002305), the Guangdong Basic and Applied Basic Research Foundation (No. 2019B1515120049), and the Natural Science Foundation of Fujian Province of China (No. 2021J01002).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Zhong, Y. et al. (2022). Fine-grained Data Distribution Alignment for Post-Training Quantization. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13671. Springer, Cham. https://doi.org/10.1007/978-3-031-20083-0_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20082-3
Online ISBN: 978-3-031-20083-0