
Compressed neural architecture utilizing dimensionality reduction and quantization


Abstract

Deep learning has become the default solution for a wide range of problems. However, the resulting models are very large and expensive to process, which makes them difficult to deploy on small or embedded devices and to transmit across the web. In light of this problem, this paper presents a novel method for converting large neural networks into lightweight, compressed models. Our method uses Principal Component Analysis, a dimensionality reduction algorithm, to decompose the network weights into smaller matrices that form a new, compressed architecture. This compressed model is further trained to recover from the error introduced by the lossy compression, and its parameters are finally stored after quantization. Experiments on benchmark datasets using standard models show that we achieve high compression, with compression rates between 5 and 35 depending on the complexity of the model, with little to no drop in model accuracy. Comparison with other state-of-the-art methods shows that the performance of our compression method is similar, and in certain cases better. This is the first work in which dimensionality reduction and quantization are combined to create a new, compressed model.
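The following is a minimal, illustrative sketch (not the authors' implementation) of the two ingredients the abstract describes: approximating a layer's weight matrix by a low-rank PCA factorization, and storing the resulting factors with uniform 8-bit quantization. The rank k, the per-tensor quantization scheme, and all function names are assumptions made for illustration; the retraining step the paper uses to recover accuracy is omitted.

```python
import numpy as np

def pca_factorize(weight, k):
    """Approximate an (out x in) weight matrix by two smaller factors via PCA.

    Hypothetical sketch: the paper decomposes layer weights with PCA; the
    exact centering and per-layer handling may differ from what is shown here.
    """
    mean = weight.mean(axis=0, keepdims=True)          # (1, in) column means
    centered = weight - mean
    # Principal directions of the centered weights, obtained via SVD.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:k]                                # (k, in)
    scores = centered @ components.T                   # (out, k)
    # weight ~= scores @ components + mean, stored as two small matrices.
    return scores, components, mean

def quantize_uint8(x):
    """Uniform 8-bit quantization with a per-tensor scale and offset."""
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 255.0 if hi > lo else 1.0
    q = np.round((x - lo) / scale).astype(np.uint8)
    return q, scale, lo

def dequantize(q, scale, lo):
    return q.astype(np.float32) * scale + lo

# Toy usage: compress a 512 x 1024 dense layer to rank 64 with 8-bit storage.
# A random matrix has no low-rank structure, so the reconstruction error here
# is pessimistic; trained weights are typically far more compressible.
rng = np.random.default_rng(0)
W = rng.normal(size=(512, 1024)).astype(np.float32)
scores, comps, mean = pca_factorize(W, k=64)
q_s, s_s, lo_s = quantize_uint8(scores)
q_c, s_c, lo_c = quantize_uint8(comps)
W_hat = dequantize(q_s, s_s, lo_s) @ dequantize(q_c, s_c, lo_c) + mean
print("relative reconstruction error:",
      np.linalg.norm(W - W_hat) / np.linalg.norm(W))
```

In this toy setting, storing the two factors (out x k and k x in) as 8-bit integers instead of one out x in float32 matrix already yields roughly a 20x reduction in storage; in the paper, the choice of rank per layer and the subsequent fine-tuning determine how much of this saving is achieved without losing accuracy.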





Author information


Corresponding author

Correspondence to Muhammad Abdullah Adnan.



About this article


Cite this article

Hasan, M.S., Alam, R. & Adnan, M.A. Compressed neural architecture utilizing dimensionality reduction and quantization. Appl Intell 53, 1271–1286 (2023). https://doi.org/10.1007/s10489-022-03221-z

