Abstract:
Compression of machine learning models, and of neural networks in particular, has become an essential problem among practitioners. Many different approaches, including quantization, pruning, and low-rank and tensor decompositions, have been proposed in the literature to solve the problem. Despite this, an important question remains unanswered: what is the best compression scheme for a model? As a step towards answering this question objectively and fairly, we empirically compare quantization, pruning, and low-rank compressions on the common algorithmic footing of the Learning-Compression (LC) framework. This allows us to explore the compression schemes systematically and perform an apples-to-apples comparison along entire error-compression tradeoff curves. We describe our methodology, the framework, and the experimental setup, and present our comparisons. Based on our experiments, we conclude that the choice of compression is strongly model-dependent: for example, VGG16 is better compressed with pruning, while quantization is more suitable for ResNets. This, once again, underlines the need for a common benchmark of compression schemes with fair and objective comparisons of the models of interest.
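To make the three compared schemes concrete, the following is a minimal illustrative sketch (not the paper's LC toolkit code) of the compression step of each scheme applied to a single weight matrix, using NumPy only; the matrix shape, sparsity level, bit width, rank, and function names are assumptions chosen for illustration. In the LC framework these compression steps would alternate with learning steps on the network weights; here each is shown in isolation.

import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 128))   # hypothetical layer weights

# 1) Magnitude pruning: keep the k largest-magnitude weights, zero the rest.
def prune(W, k):
    flat = np.abs(W).ravel()
    thresh = np.partition(flat, flat.size - k)[flat.size - k]
    return np.where(np.abs(W) >= thresh, W, 0.0)

# 2) Scalar quantization: cluster weights into 2^bits shared values (Lloyd's k-means).
def quantize(W, bits=2, iters=20):
    flat = W.ravel()
    codebook = np.linspace(flat.min(), flat.max(), 2 ** bits)
    for _ in range(iters):
        idx = np.argmin(np.abs(flat[:, None] - codebook[None, :]), axis=1)
        for c in range(codebook.size):
            if np.any(idx == c):
                codebook[c] = flat[idx == c].mean()
    return codebook[idx].reshape(W.shape)

# 3) Low-rank compression: best rank-r approximation via truncated SVD.
def low_rank(W, r):
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r, :]

for name, Wc in [("pruned", prune(W, k=W.size // 10)),
                 ("quantized", quantize(W, bits=2)),
                 ("low-rank", low_rank(W, r=16))]:
    err = np.linalg.norm(W - Wc) / np.linalg.norm(W)
    print(f"{name:>9s}: relative error {err:.3f}")

Sweeping the per-scheme hyperparameter (number of kept weights, bit width, or rank) is what traces out the error-compression tradeoff curves along which the paper compares the schemes.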
Date of Conference: 18-22 July 2021
Date Added to IEEE Xplore: 20 September 2021