Abstract
In machine learning, model compression is vital for resource-constrained Internet of Things (IoT) devices, such as unmanned aerial vehicles (UAVs) and smartphones. Several state-of-the-art (SOTA) compression methods exist, but few studies have evaluated them across different models and datasets. In this paper, we present an in-depth study of two SOTA model compression methods, pruning and quantization. We apply these methods to AlexNet, ResNet18, VGG16BN and VGG19BN, using three well-known datasets: Fashion-MNIST, CIFAR-10, and UCI-HAR. From this study, we conclude that pruning followed by retraining preserves performance (less than \(0.5\%\) degradation on average) while reducing the model size (at a \(10\times \) compression rate) on spatial-domain datasets (e.g., images); that performance on temporal-domain datasets (e.g., motion sensor data) degrades more (about \(5.0\%\) on average); and that the performance of quantization is related to the pruning rate and the network architecture. We also compare different clustering methods and show their impact on model accuracy and quantization ratio. Finally, we provide some interesting directions for future research.
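The abstract does not give implementation details, so the following is only a minimal sketch of the two techniques it names, under stated assumptions: pruning is taken to be magnitude-based (keeping the largest 10% of weights, which corresponds to a \(10\times \) reduction in non-zero weights), and quantization is taken to be weight sharing via one-dimensional k-means. The function names `magnitude_prune` and `kmeans_quantize`, the linear centroid initialization, the random 64×64 weight matrix, and the choice of 16 clusters are all hypothetical and chosen for illustration. Retraining after pruning is omitted.

```python
import numpy as np

def magnitude_prune(weights, keep_ratio):
    """Zero all but the largest-magnitude `keep_ratio` fraction of weights."""
    flat = np.abs(weights).ravel()
    k = max(1, int(round(keep_ratio * flat.size)))
    # The threshold is the k-th largest magnitude.
    threshold = np.partition(flat, flat.size - k)[flat.size - k]
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

def kmeans_quantize(weights, mask, n_clusters, n_iters=20):
    """Share the surviving weights among `n_clusters` centroids (1-D k-means)."""
    values = weights[mask]
    # Assumed initialization: centroids spaced linearly over the value range.
    centroids = np.linspace(values.min(), values.max(), n_clusters)
    for _ in range(n_iters):
        # Assign each weight to its nearest centroid.
        assign = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        # Move each non-empty centroid to the mean of its members.
        for c in range(n_clusters):
            members = values[assign == c]
            if members.size:
                centroids[c] = members.mean()
    assign = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
    quantized = np.zeros_like(weights)
    quantized[mask] = centroids[assign]
    return quantized, centroids

# Illustrative data only: a random matrix standing in for one layer's weights.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))
pruned, mask = magnitude_prune(w, keep_ratio=0.1)
quantized, centroids = kmeans_quantize(pruned, mask, n_clusters=16)
```

After these two steps, each surviving weight can be stored as a 4-bit index into the 16-entry centroid table instead of a full floating-point value. How the centroids are initialized is one of the clustering choices that can affect the result.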
Acknowledgements
This work is in part supported by an Australian Research Council (ARC) Discovery Project (DP210102447), an ARC Linkage Project (LP190100676), and a DATA61 project (Data61 CRP C020996).
Copyright information
© 2023 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Tian, Y., Luan, T.H., Zheng, X. (2023). An Empirical Study on Model Pruning and Quantization. In: Wang, W., Wu, J. (eds) Broadband Communications, Networks, and Systems. BROADNETS 2023. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 511. Springer, Cham. https://doi.org/10.1007/978-3-031-40467-2_7
DOI: https://doi.org/10.1007/978-3-031-40467-2_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-40466-5
Online ISBN: 978-3-031-40467-2
eBook Packages: Computer Science, Computer Science (R0)