Quantizing separable convolution of MobileNets with mixed precision
Chenlu Zhang, Guanpeng Zuo, Zhe Zheng, Wu Zhang, Yuan Rao, Zhaohui Jiang
Abstract

As deep learning moves toward edge computing, researchers have developed techniques for efficient resource usage and accurate inference on mobile devices. Quantization, one of the key approaches, enables the deployment of deep learning models on embedded platforms. However, MobileNet's accuracy suffers from quantization errors in its depth-wise separable convolutions. To reach a smaller model size, we adopt a mixed-precision quantization strategy instead of uniform quantization, and to preserve accuracy, we search for per-layer bit-widths over a quantization-friendly separable convolution architecture. This architecture improves MobileNet's accuracy by addressing redundancy and quantization loss. Compared with fixed-bit quantization, our framework achieves an eightfold model size reduction with minimal accuracy loss. Evaluated on the ImageNet dataset and the Common Objects in Context (COCO) dataset, our modified MobileNets nearly close the gap to the floating-point pipeline across 2-, 4-, 6-, and 8-bit settings. In the ablation experiment, the mixed-precision model maintains an accuracy of 72.84% while being compressed more than eight times.
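To make the idea of a per-layer bit-width policy concrete, below is a minimal sketch of mixed-precision weight quantization in NumPy. The symmetric fake-quantization scheme, the layer names, the tensor shapes, and the 8-/4-bit assignments are illustrative assumptions for this sketch, not the strategy searched in the paper.

import numpy as np

def quantize_symmetric(w, bits):
    """Fake-quantize a weight tensor to `bits` bits with a symmetric,
    per-tensor scale, then dequantize back to float to expose the
    quantization error."""
    qmax = 2 ** (bits - 1) - 1               # e.g., 127 for 8 bits
    scale = float(np.max(np.abs(w))) / qmax  # map the largest magnitude to qmax
    if scale == 0.0:
        return w                             # all-zero tensor: nothing to quantize
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

# Hypothetical policy: the depth-wise layer is more sensitive to quantization
# error, so it keeps more bits than the point-wise layer.
bit_policy = {"depthwise_conv": 8, "pointwise_conv": 4}

weights = {
    "depthwise_conv": np.random.randn(3, 3, 32),      # 3x3 per-channel filters
    "pointwise_conv": np.random.randn(1, 1, 32, 64),  # 1x1 channel-mixing filters
}

for name, w in weights.items():
    wq = quantize_symmetric(w, bit_policy[name])
    mse = float(np.mean((w - wq) ** 2))
    print(f"{name}: {bit_policy[name]}-bit, quantization MSE = {mse:.6f}")

For reference, 4-bit weights occupy one-eighth the space of 32-bit floats, which matches the scale of compression reported above relative to a floating-point baseline.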

© 2024 SPIE and IS&T
Chenlu Zhang, Guanpeng Zuo, Zhe Zheng, Wu Zhang, Yuan Rao, and Zhaohui Jiang "Quantizing separable convolution of MobileNets with mixed precision," Journal of Electronic Imaging 33(1), 013013 (13 January 2024). https://doi.org/10.1117/1.JEI.33.1.013013
Received: 21 June 2023; Accepted: 20 December 2023; Published: 13 January 2024
KEYWORDS
Quantization, Convolution, Education and training, Ablation, Performance modeling, Batch normalization, Data modeling