
Methodologies of Compressing a Stable Performance Convolutional Neural Networks in Image Classification


Abstract

Deep learning has made a real revolution in the embedded computing environment, and the convolutional neural network (CNN) has proven to be a reliable fit for many emerging problems. The next step is to enhance the CNN's role in embedded devices, in terms of both implementation and performance. Storage and computational resources in embedded devices are limited and constrained, which raises key issues that must be considered. Compressing (i.e., quantizing) the CNN is a valuable solution. In this paper, our main goals are memory compression and complexity reduction (reducing both operations and cycles) of CNNs, using methods (including quantization and pruning) that do not require retraining (i.e., allowing us to exploit them in mobile systems or robots), and exploring further quantization techniques for additional complexity reduction. To achieve these goals, we compress a CNN model's layers (i.e., parameters and outputs) into suitable precision formats using several quantization methodologies. First, we describe a pruning approach, which reduces the required storage and computation cycles in embedded devices; this enhancement can drastically reduce the consumed power and the required resources. Second, we present a hybrid quantization approach with automatic tuning for network compression. Third, we apply a K-means quantization approach. With only minor degradation relative to the floating-point performance, the presented pruning and quantization methods produce stable-performance fixed-point reduced networks. Precise fixed-point calculations for coefficients, input/output signals, and accumulators are considered in the quantization process.
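The abstract leaves the three methodologies at a high level. As a rough, non-authoritative illustration of the kind of retraining-free compression it describes, the NumPy sketch below combines magnitude pruning, per-layer fixed-point quantization (a simple bit split between integer and fractional parts driven by the largest weight magnitude), and K-means weight sharing. All function names, the bit-allocation heuristic, and the parameter values are assumptions made for this sketch, not the authors' exact procedures.

```python
import numpy as np

def prune_weights(w, sparsity=0.5):
    """Magnitude pruning sketch: zero out the smallest-magnitude weights."""
    threshold = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) < threshold, 0.0, w)

def to_fixed_point(w, total_bits=8):
    """Per-layer fixed-point quantization sketch.

    Assumed heuristic: allocate enough integer bits (plus sign) to cover
    the largest weight magnitude, and spend the remaining bits on the
    fractional part.
    """
    il = int(np.ceil(np.log2(np.max(np.abs(w)) + 1e-12))) + 1  # +1 sign bit
    fl = total_bits - il                                       # fractional bits
    scale = 2.0 ** fl
    lo, hi = -2 ** (total_bits - 1), 2 ** (total_bits - 1) - 1
    q = np.clip(np.round(w * scale), lo, hi)
    return q / scale  # dequantized values lying on the fixed-point grid

def kmeans_quantize(w, n_clusters=16, n_iter=20, seed=0):
    """K-means weight-sharing sketch: each weight is replaced by its nearest
    centroid, so only the centroids and per-weight indices need storing."""
    rng = np.random.default_rng(seed)
    flat = w.ravel()
    centroids = rng.choice(flat, size=n_clusters, replace=False)
    for _ in range(n_iter):
        idx = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
        for k in range(n_clusters):
            members = flat[idx == k]
            if members.size:
                centroids[k] = members.mean()
    return centroids[idx].reshape(w.shape)

# Example: compress one convolutional layer's weights without retraining.
w = np.random.randn(64, 3, 3, 3).astype(np.float32)
w_pruned_quant = to_fixed_point(prune_weights(w, sparsity=0.6), total_bits=8)
w_kmeans = kmeans_quantize(w, n_clusters=16)
```

In an actual deployment, the compressed values would replace the floating-point layer parameters, and, as the abstract notes, the same fixed-point treatment would also be applied to the layer inputs/outputs and accumulators.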




Author information

Correspondence to Mo’taz Al-Hami.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Al-Hami, M., Pietron, M., Casas, R. et al. Methodologies of Compressing a Stable Performance Convolutional Neural Networks in Image Classification. Neural Process Lett 51, 105–127 (2020). https://doi.org/10.1007/s11063-019-10076-y
