Abstract
Deep neural networks (DNNs) provide state-of-the-art accuracy in many application domains, such as computer vision and speech recognition. At the same time, DNNs require millions of expensive floating-point operations to process each input, which limits their applicability to resource-constrained systems with tight budgets on hardware design area or power consumption. Our goal is to devise lightweight, approximate accelerators for DNN inference that use fewer hardware resources with negligible reduction in accuracy. To simplify the hardware requirements, we analyze a spectrum of data precisions ranging from fixed-point and dynamic fixed-point, through power-of-two, to binary representations. In conjunction, we provide new training methods that compensate for the simpler hardware resources. To boost the accuracy of the proposed lightweight accelerators, we describe ensemble processing techniques that use an ensemble of lightweight DNN accelerators to achieve the same or better accuracy than the original floating-point accelerator, while still using far fewer hardware resources. Using 65 nm technology libraries and an industrial-strength design flow, we demonstrate a custom hardware accelerator design and training procedure that achieve low power and low latency while incurring insignificant accuracy degradation. We evaluate our design and techniques on the CIFAR-10 and ImageNet datasets and show that significant reductions in power and inference latency are realized.
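To make the precision spectrum named in the abstract concrete, the following is a minimal NumPy sketch of the four weight-quantization schemes (fixed-point, dynamic fixed-point, power-of-two, and binary). The bit-widths, helper names, and the mean-absolute scaling used for binarization are illustrative assumptions, not the chapter's exact accelerator arithmetic.

import numpy as np

def to_fixed_point(w, int_bits=8, frac_bits=8):
    # Quantize to an (int_bits, frac_bits) fixed-point grid with saturation.
    scale = 2.0 ** frac_bits
    max_val = 2.0 ** (int_bits - 1) - 1.0 / scale
    return np.clip(np.round(w * scale) / scale, -max_val, max_val)

def to_dynamic_fixed_point(w, total_bits=8):
    # Choose the integer/fractional split per tensor so the largest magnitude still fits.
    int_bits = max(1, int(np.ceil(np.log2(np.max(np.abs(w)) + 1e-12))) + 1)
    return to_fixed_point(w, int_bits, total_bits - int_bits)

def to_power_of_two(w):
    # Round each weight to the nearest signed power of two (multiplier-free: shifts only).
    return np.sign(w) * 2.0 ** np.round(np.log2(np.abs(w) + 1e-12))

def to_binary(w):
    # Binarize to +/- alpha, with alpha the mean absolute weight (XNOR-Net-style scaling).
    return np.mean(np.abs(w)) * np.sign(w)

# Compare the quantization error of each scheme on a random weight tensor.
w = np.random.randn(4, 4).astype(np.float32)
for quantize in (to_fixed_point, to_dynamic_fixed_point, to_power_of_two, to_binary):
    print(quantize.__name__, float(np.max(np.abs(quantize(w) - w))))

As the schemes move from fixed-point toward binary, the hardware cost of each multiply-accumulate drops (shifts or sign flips replace multipliers), which is the trade-off the lightweight accelerators exploit.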
References
Ba J, Caruana R (2014) Do deep nets really need to be deep? In: Advances in neural information processing systems (NIPS 2014), pp 2654–2662
Bucilua C, Caruana R, Niculescu-Mizil A (2006) Model compression. In: Proceedings of ACM SIGKDD
Chen T, Du Z, Sun N, Wang J, Wu C, Chen Y, Temam O (2014) Diannao: a small-footprint high-throughput accelerator for ubiquitous machine-learning. In: Proceedings of ACM ASPLOS. ACM, New York, pp 269–284
Courbariaux M, Bengio Y, David JP (2014) Low precision arithmetic for deep learning. arXiv:1412.7024
Glorot X, Bordes A, Bengio Y (2011) Deep sparse rectifier neural networks. In: Proceedings of the fourteenth international conference on artificial intelligence and statistics, pp 315–323
Graham B (2014) Fractional max-pooling. arXiv:1412.6071
Gysel P (2016) Ristretto: hardware-oriented approximation of convolutional neural networks. arXiv:1605.06402
Hashemi S, Anthony N, Tann H, Bahar RI, Reda S (2017) Understanding the impact of precision quantization on the accuracy and energy of neural networks. In: Proceedings of DATE
He K, Zhang X, Ren S, Sun J (2015) Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Pattern Anal Mach Intell 37(9):1904–1916
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
Hinton G, Vinyals O, Dean J (2015) Distilling the knowledge in a neural network. arXiv:1503.02531
Huang G, Liu Z, Weinberger KQ, van der Maaten L (2017) Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, vol 1, p 3
Hubara I, Courbariaux M, Soudry D, El-Yaniv R, Bengio Y (2016) Binarized neural networks. In: Lee DD, Sugiyama M, Luxburg UV, Guyon I, Garnett R (eds) Advances in neural information processing systems, vol 29. Curran Associates, New York, pp 4107–4115
Jia Y, Shelhamer E, Donahue J, Karayev S, Long J, Girshick R, Guadarrama S, Darrell T (2014) Caffe: convolutional architecture for fast feature embedding. arXiv:1408.5093
Krizhevsky A, Hinton G (2009) Learning multiple layers of features from tiny images. Technical report, University of Toronto
Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. In: Proceedings of NIPS
Lin M, Chen Q, Yan S (2013) Network in network. arXiv:1312.4400
Lin Z, Courbariaux M, Memisevic R, Bengio Y (2015) Neural networks with few multiplications. arXiv:1510.03009
Rastegari M, Ordonez V, Redmon J, Farhadi A (2015) XNOR-Net: ImageNet classification using binary convolutional neural networks. In: European conference on computer vision. Springer, Berlin, pp 525–542
Romero A, Ballas N, Kahou SE, Chassang A, Gatta C, Bengio Y (2014) FitNets: hints for thin deep nets. arXiv:1412.6550
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg AC, Fei-Fei L (2015) ImageNet large scale visual recognition challenge. Int J Comput Vis 115(3):211–252
Shafique M, Hafiz R, Javed MU, Abbas S, Sekanina L, Vasicek Z, Mrazek V (2017) Adaptive and energy-efficient architectures for machine learning: challenges, opportunities, and research roadmap. In: 2017 IEEE computer society annual symposium on VLSI (ISVLSI). IEEE, Piscataway, pp 627–632
Szegedy C, Ioffe S, Vanhoucke V, Alemi AA (2017) Inception-v4, inception-resnet and the impact of residual connections on learning. In: Association for the advancement of artificial intelligence, vol 4, p 12
Tann H, Hashemi S, Bahar RI, Reda S (2016) Runtime configurable deep neural networks for energy-accuracy trade-off. arXiv:1607.05418
Tann H, Hashemi S, Bahar RI, Reda S (2017) Hardware-software codesign of accurate, multiplier-free deep neural networks. In: 2017 54th ACM/EDAC/IEEE design automation conference (DAC). IEEE, Piscataway, pp 1–6
Zagoruyko S, Komodakis N (2016) Wide residual networks. arXiv:1605.07146
Acknowledgements
We would like to thank Professor R. Iris Bahar and N. Anthony for their contributions to this project [8, 25]. Compared to our two previous publications [8, 25], this chapter provides additional experimental results for various quantization schemes and for ensemble deployment. More specifically, the novel contributions of this chapter include implementations of accelerators capable of performing ensemble inference for fixed-point (16,16), fixed-point (8,8), and power-of-two (6,16) precisions. We also evaluate the performance of these accelerators in side-by-side comparisons with those from our previous works in Figs. 14.7 and 14.8, and we generalize our accuracy-boosting ensemble technique to all types of quantized networks, not just dynamic fixed-point. The additional results fill the gaps between our two previous publications and allow for a more complete design space exploration for approximate deep neural network accelerators. This work is supported by NSF grant 1420864 and by generous GPU hardware donations from NVIDIA Corporation.
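For concreteness, the ensemble inference mentioned above can be summarized by a short NumPy sketch that averages class probabilities across several quantized member networks. The per-member forward() interface and the probability-averaging rule are illustrative assumptions, not the accelerators' exact combination logic.

import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class dimension.
    z = logits - np.max(logits, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

def ensemble_predict(members, x):
    # Average per-member class probabilities, then take the argmax label.
    probs = np.mean([softmax(m.forward(x)) for m in members], axis=0)
    return np.argmax(probs, axis=-1)

Because each member is a lightweight quantized network, several members can run in parallel for less hardware cost than a single floating-point accelerator while recovering (or exceeding) its accuracy.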
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Tann, H., Hashemi, S., Reda, S. (2019). Lightweight Deep Neural Network Accelerators Using Approximate SW/HW Techniques. In: Reda, S., Shafique, M. (eds) Approximate Circuits. Springer, Cham. https://doi.org/10.1007/978-3-319-99322-5_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99321-8
Online ISBN: 978-3-319-99322-5