
An Empirical Study on Model Pruning and Quantization

  • Conference paper
Broadband Communications, Networks, and Systems (BROADNETS 2023)

Abstract

In machine learning, model compression is vital for resource-constrained Internet of Things (IoT) devices, such as unmanned aerial vehicles (UAVs) and smartphones. Several state-of-the-art (SOTA) compression methods exist, but little work has been done to evaluate these techniques across different models and datasets. In this paper, we present an in-depth study of two SOTA model compression methods, pruning and quantization. We apply these methods to AlexNet, ResNet18, VGG16BN and VGG19BN on three well-known datasets: Fashion-MNIST, CIFAR-10, and UCI-HAR. From our study, we conclude that pruning followed by retraining preserves performance (on average less than \(0.5\%\) degradation) while reducing model size (at a \(10\times \) compression rate) on spatial-domain datasets (e.g., pictures); on temporal-domain datasets (e.g., motion-sensor data), performance degrades more (about \(5.0\%\) on average); and the performance of quantization depends on the pruning rate and the network architecture. We also compare different clustering methods and reveal their impact on model accuracy and quantization ratio. Finally, we suggest some interesting directions for future research.
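The pipeline the abstract describes, magnitude pruning followed by clustering-based weight quantization, can be sketched in plain NumPy. This is an illustrative sketch only, not the authors' implementation (their code is linked in the notes below); the helper names `magnitude_prune` and `kmeans_quantize` are hypothetical, and the linear centroid initialisation is just one of the clustering initialisations a study like this might compare.

```python
import numpy as np

def magnitude_prune(weights, rate):
    """Zero out the fraction `rate` of weights with the smallest magnitude."""
    flat = np.abs(weights).ravel()
    k = int(flat.size * rate)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value acts as the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    return weights * (np.abs(weights) > threshold)

def kmeans_quantize(weights, n_clusters=16, n_iter=20):
    """Cluster the surviving (non-zero) weights with 1-D k-means and
    replace each weight by its nearest centroid, so the layer only needs
    to store n_clusters distinct values plus per-weight cluster indices."""
    nz = weights != 0
    vals = weights[nz]
    # linear initialisation: centroids spread evenly over the weight range
    centroids = np.linspace(vals.min(), vals.max(), n_clusters)
    for _ in range(n_iter):
        assign = np.argmin(np.abs(vals[:, None] - centroids[None, :]), axis=1)
        for c in range(n_clusters):
            members = vals[assign == c]
            if members.size:
                centroids[c] = members.mean()
    quantized = weights.copy()
    idx = np.argmin(np.abs(vals[:, None] - centroids[None, :]), axis=1)
    quantized[nz] = centroids[idx]
    return quantized
```

In a full experiment each pruning step would be followed by retraining on the remaining weights (the step the abstract credits for keeping accuracy), which this standalone sketch omits.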


Notes

  1. https://www.raspberrypi.com/documentation/microcontrollers/raspberry-pi-pico.html
  2. https://github.com/paul-tian/broadnets2022-compression
  3. https://github.com/zalandoresearch/fashion-mnist
  4. https://www.cs.toronto.edu/~kriz/cifar.html
  5. https://archive.ics.uci.edu/ml/datasets/human+activity+recognition+using+smartphones
  6. https://github.com/paul-tian/broadnets2022-compression


Acknowledgements

This work is in part supported by an Australian Research Council (ARC) Discovery Project (DP210102447), an ARC Linkage Project (LP190100676), and a DATA61 project (Data61 CRP C020996).

Author information

Correspondence to Xi Zheng.


Copyright information

© 2023 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper


Cite this paper

Tian, Y., Luan, T.H., Zheng, X. (2023). An Empirical Study on Model Pruning and Quantization. In: Wang, W., Wu, J. (eds) Broadband Communications, Networks, and Systems. BROADNETS 2023. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 511. Springer, Cham. https://doi.org/10.1007/978-3-031-40467-2_7


  • DOI: https://doi.org/10.1007/978-3-031-40467-2_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-40466-5

  • Online ISBN: 978-3-031-40467-2

  • eBook Packages: Computer Science, Computer Science (R0)
