Quantizing separable convolution of MobileNets with mixed precision
Chenlu Zhang, Guanpeng Zuo, Zhe Zheng, Wu Zhang, Yuan Rao, Zhaohui Jiang
Abstract

As deep learning moves toward edge computing, researchers have developed techniques for efficient resource usage and accurate inference on mobile devices. Quantization, one of the key approaches, enables the deployment of deep learning models on embedded platforms. However, MobileNet's accuracy suffers from quantization errors in its depth-wise separable convolutions. To reach a smaller model size, we adopt a mixed-precision quantization strategy instead of uniform quantization, and to preserve accuracy, we search for per-layer bit-widths over a quantization-friendly separable convolution architecture. This architecture improves MobileNet's accuracy by addressing redundancy and quantization loss. Compared with fixed-bit quantization, our framework achieves an eightfold model size reduction with minimal accuracy loss. Evaluated on the ImageNet dataset and the Common Objects in Context (COCO) dataset, our modified MobileNets nearly close the gap to the floating-point pipeline across 2-, 4-, 6-, and 8-bit settings. In the ablation experiment, the mixed-precision model maintains an accuracy of 72.84% while being compressed more than eight times.
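To make the idea of a per-layer bit-width policy concrete, below is a minimal sketch of mixed-precision weight quantization in NumPy. The symmetric fake-quantization scheme, the layer names, the tensor shapes, and the 8-/4-bit assignments are illustrative assumptions for this sketch, not the strategy searched in the paper.

import numpy as np

def quantize_symmetric(w, bits):
    """Fake-quantize a weight tensor to `bits` bits with a symmetric,
    per-tensor scale, then dequantize back to float to expose the
    quantization error."""
    qmax = 2 ** (bits - 1) - 1               # e.g., 127 for 8 bits
    scale = float(np.max(np.abs(w))) / qmax  # map the largest magnitude to qmax
    if scale == 0.0:
        return w                             # all-zero tensor: nothing to quantize
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

# Hypothetical policy: the depth-wise layer is more sensitive to quantization
# error, so it keeps more bits than the point-wise layer.
bit_policy = {"depthwise_conv": 8, "pointwise_conv": 4}

weights = {
    "depthwise_conv": np.random.randn(3, 3, 32),      # 3x3 per-channel filters
    "pointwise_conv": np.random.randn(1, 1, 32, 64),  # 1x1 channel-mixing filters
}

for name, w in weights.items():
    wq = quantize_symmetric(w, bit_policy[name])
    mse = float(np.mean((w - wq) ** 2))
    print(f"{name}: {bit_policy[name]}-bit, quantization MSE = {mse:.6f}")

For reference, 4-bit weights occupy one-eighth the space of 32-bit floats, which matches the scale of compression reported above relative to a floating-point baseline.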

© 2024 SPIE and IS&T
Chenlu Zhang, Guanpeng Zuo, Zhe Zheng, Wu Zhang, Yuan Rao, and Zhaohui Jiang "Quantizing separable convolution of MobileNets with mixed precision," Journal of Electronic Imaging 33(1), 013013 (13 January 2024). https://doi.org/10.1117/1.JEI.33.1.013013
Received: 21 June 2023; Accepted: 20 December 2023; Published: 13 January 2024
KEYWORDS
Quantization, Convolution, Education and training, Ablation, Performance modeling, Batch normalization, Data modeling