Abstract
Vision Transformers (ViTs) have demonstrated outstanding performance on visual tasks. However, deploying ViT models and running inference on resource-constrained edge devices is challenging because of their high computational overhead. Existing quantization methods require access to the original training data, which raises security and privacy concerns. To address this issue, this paper proposes a data-free quantization method named Perturbation-Aware Vision Transformer (PA-ViT), which enhances the robustness of synthetic images and thereby improves the performance of downstream post-training quantization. Specifically, PA-ViT perturbs the synthetic images and then models the inconsistency between the attention maps and predicted labels of the perturbed and unperturbed images as processed by the full-precision (FP) model; a loss function built on this inconsistency guides the generation of robust images. Experimental results on ImageNet show significant improvements over existing techniques, and the method even surpasses quantization with real data. For instance, with Swin-T as the backbone, PA-ViT improves top-1 accuracy over the state-of-the-art method by 5.29% and 4.93% when quantized to 4-bit and 8-bit precision, respectively, providing an effective solution for data-free post-training quantization of vision transformers.
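To make the idea concrete, the following is a minimal PyTorch sketch of a perturbation-aware consistency objective of this kind. It assumes the frozen full-precision ViT is wrapped so that a forward pass returns both logits and stacked attention maps; the additive Gaussian perturbation, the loss weights, and the `fp_model` wrapper interface are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch of a perturbation-aware consistency loss for data-free image
# synthesis.  Assumption: the frozen full-precision ViT wrapper returns
# (logits, attention_maps) in a single forward pass.
import torch
import torch.nn.functional as F


def perturbation_aware_loss(fp_model, synth_images, noise_std=0.05,
                            attn_weight=1.0, label_weight=1.0):
    """Encourage synthetic images whose attention maps and predictions
    stay consistent under small input perturbations."""
    # Perturb the current synthetic batch (assumption: additive Gaussian noise).
    perturbed = synth_images + noise_std * torch.randn_like(synth_images)

    # The frozen FP model is assumed to return (logits, attention_maps),
    # e.g. attention_maps of shape [layers, batch, heads, tokens, tokens].
    logits_clean, attn_clean = fp_model(synth_images)
    logits_pert, attn_pert = fp_model(perturbed)

    # Inconsistency between attention maps of clean vs. perturbed inputs.
    attn_loss = F.mse_loss(attn_pert, attn_clean.detach())

    # Inconsistency between predicted label distributions (KL divergence).
    label_loss = F.kl_div(F.log_softmax(logits_pert, dim=-1),
                          F.softmax(logits_clean.detach(), dim=-1),
                          reduction="batchmean")

    # Combined objective that steers the image-update step toward
    # perturbation-robust synthetic samples.
    return attn_weight * attn_loss + label_weight * label_loss


# Usage sketch: synthetic images are learnable tensors optimized with this
# loss, typically alongside other data-free synthesis objectives.
# synth_images = torch.randn(8, 3, 224, 224, requires_grad=True)
# optimizer = torch.optim.Adam([synth_images], lr=0.01)
# loss = perturbation_aware_loss(fp_vit, synth_images)
# loss.backward(); optimizer.step()
```

In this sketch the synthetic images themselves are the optimization variables, and the consistency terms penalize images whose attention maps or predictions change under perturbation, which is one way to realize the robustness objective described in the abstract.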