Abstract
Deep learning has become the default solution for a wide range of problems. One drawback of deep learning-based solutions, however, is that the models are large and computationally expensive, which makes them difficult to deploy on small or embedded devices and to transmit over the web. To address this problem, this paper presents a novel method for converting large neural networks into lightweight, compressed models. Our method uses Principal Component Analysis, a dimensionality reduction algorithm, to decompose the network weights into smaller matrices, yielding a new, compressed architecture. The compressed model is then retrained to recover the accuracy lost to the lossy compression, and its parameters are finally stored after quantization. Experiments on benchmark datasets with standard models show that we achieve high compression, with compression ratios between 5x and 35x depending on model complexity, and little to no loss in accuracy. Comparison with other state-of-the-art methods shows that our compression method performs similarly, and in certain cases better. This is the first work in which dimensionality reduction and quantization are combined to create a new, compressed model.
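To give a concrete flavor of the two ingredients the abstract names, below is a minimal NumPy sketch: a weight matrix is factorized into two smaller matrices via PCA, and the factors are then uniformly quantized to 8 bits. The layer size (512 x 512), rank k = 32, and the uint8 quantizer are illustrative assumptions; this is not the paper's exact algorithm, which also retrains the compressed model to recover accuracy.

```python
# Hypothetical sketch of PCA-based weight factorization plus 8-bit quantization.
# Shapes, rank, and quantizer are illustrative assumptions, not the paper's method.
import numpy as np

def pca_factorize(W, k):
    """Approximate W (m x n) as A @ B + mean, with A (m x k) and B (k x n)."""
    mean = W.mean(axis=0, keepdims=True)            # center rows, as PCA requires
    U, S, Vt = np.linalg.svd(W - mean, full_matrices=False)
    A = U[:, :k] * S[:k]                            # coordinates in the PCA basis (m x k)
    B = Vt[:k]                                      # top-k principal directions (k x n)
    return A, B, mean

def quantize_uint8(M):
    """Uniform 8-bit quantization; returns codes plus scale/offset to dequantize."""
    lo, hi = M.min(), M.max()
    scale = (hi - lo) / 255.0 or 1.0                # guard against a constant matrix
    codes = np.round((M - lo) / scale).astype(np.uint8)
    return codes, scale, lo

def dequantize(codes, scale, lo):
    return codes.astype(np.float32) * scale + lo

# Example: compress one hypothetical 512 x 512 dense layer to rank 32.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512)).astype(np.float32)
A, B, mean = pca_factorize(W, k=32)
qA, sA, loA = quantize_uint8(A)
qB, sB, loB = quantize_uint8(B)
W_hat = dequantize(qA, sA, loA) @ dequantize(qB, sB, loB) + mean
orig_bytes = W.nbytes
comp_bytes = qA.nbytes + qB.nbytes + mean.nbytes
print(f"compression ratio ~{orig_bytes / comp_bytes:.1f}x, "
      f"relative error {np.linalg.norm(W - W_hat) / np.linalg.norm(W):.3f}")
```

The storage saving comes from the factorization itself: an m x n layer with mn parameters becomes k(m + n) parameters, which is smaller whenever k < mn / (m + n), and the 8-bit codes shrink each stored value by a further factor of four relative to float32.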
Cite this article
Hasan, M.S., Alam, R. & Adnan, M.A. Compressed neural architecture utilizing dimensionality reduction and quantization. Appl Intell 53, 1271–1286 (2023). https://doi.org/10.1007/s10489-022-03221-z