Lightweight Deep Neural Network Accelerators Using Approximate SW/HW Techniques

  • Chapter in: Approximate Circuits

Abstract

Deep neural networks (DNNs) provide state-of-the-art accuracy in many application domains, such as computer vision and speech recognition. At the same time, DNNs require millions of expensive floating-point operations to process each input, which limits their applicability to systems constrained in hardware area or power consumption. Our goal is to devise lightweight, approximate accelerators for DNN inference that use fewer hardware resources with negligible reduction in accuracy. To simplify the hardware requirements, we analyze a spectrum of data precision schemes ranging from fixed-point and dynamic fixed-point to powers-of-two and binary representations. In conjunction, we provide new training methods that compensate for the simpler hardware. To boost the accuracy of the proposed lightweight accelerators, we describe ensemble processing techniques that combine several lightweight DNN accelerators to match or exceed the accuracy of the original floating-point accelerator while still using far fewer hardware resources. Using 65 nm technology libraries and an industrial-strength design flow, we demonstrate a custom hardware accelerator design and training procedure that achieve low power and low latency with insignificant accuracy degradation. We evaluate our design and techniques on the CIFAR-10 and ImageNet datasets and show that significant reductions in power and inference latency are realized.
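As a concrete illustration of the precision spectrum described above, the following is a minimal NumPy sketch of fixed-point and power-of-two quantization; the function names, the saturation convention, and the small epsilon guarding the logarithm are illustrative assumptions, not the chapter's accelerator implementation.

```python
import numpy as np

def to_fixed_point(x, int_bits=8, frac_bits=8):
    """Snap values to an (int_bits, frac_bits) fixed-point grid with
    saturation. Illustrative sketch only."""
    scale = 2.0 ** frac_bits
    max_val = 2.0 ** (int_bits - 1) - 1.0 / scale   # largest representable value
    x_q = np.round(x * scale) / scale               # round to the fractional grid
    return np.clip(x_q, -(2.0 ** (int_bits - 1)), max_val)

def to_power_of_two(x, exp_bits=6):
    """Quantize nonzero values to signed powers of two so that multiplications
    reduce to shifts; exp_bits bounds the exponent range."""
    sign = np.sign(x)
    exp = np.round(np.log2(np.abs(x) + 1e-12))      # nearest power-of-two exponent
    exp = np.clip(exp, -(2 ** (exp_bits - 1)), 2 ** (exp_bits - 1) - 1)
    return sign * (2.0 ** exp)
```

Binary precision is the extreme point of this spectrum, keeping only the sign of each weight, while dynamic fixed-point additionally lets the split between integer and fractional bits vary across layers.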


References

  1. Ba J, Caruana R (2014) Do deep nets really need to be deep? In: Advances in neural information processing systems (NIPS 2014), pp 2654–2662
  2. Bucilua C, Caruana R, Niculescu-Mizil A (2006) Model compression. In: Proceedings of ACM SIGKDD
  3. Chen T, Du Z, Sun N, Wang J, Wu C, Chen Y, Temam O (2014) DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning. In: Proceedings of ACM ASPLOS. ACM, New York, pp 269–284
  4. Courbariaux M, Bengio Y, David JP (2014) Low precision arithmetic for deep learning. arXiv:1412.7024
  5. Glorot X, Bordes A, Bengio Y (2011) Deep sparse rectifier neural networks. In: Proceedings of the fourteenth international conference on artificial intelligence and statistics, pp 315–323
  6. Graham B (2014) Fractional max-pooling. arXiv:1412.6071
  7. Gysel P (2016) Ristretto: hardware-oriented approximation of convolutional neural networks. CoRR, abs/1605.06402
  8. Hashemi S, Anthony N, Tann H, Bahar RI, Reda S (2017) Understanding the impact of precision quantization on the accuracy and energy of neural networks. In: Proceedings of DATE
  9. He K, Zhang X, Ren S, Sun J (2015) Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Pattern Anal Mach Intell 37(9):1904–1916
  10. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
  11. Hinton G, Vinyals O, Dean J (2015) Distilling the knowledge in a neural network. arXiv:1503.02531
  12. Huang G, Liu Z, Weinberger KQ, van der Maaten L (2017) Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, vol 1, p 3
  13. Hubara I, Courbariaux M, Soudry D, El-Yaniv R, Bengio Y (2016) Binarized neural networks. In: Lee DD, Sugiyama M, Luxburg UV, Guyon I, Garnett R (eds) Advances in neural information processing systems, vol 29. Curran Associates, New York, pp 4107–4115
  14. Jia Y, Shelhamer E, Donahue J, Karayev S, Long J, Girshick R, Guadarrama S, Darrell T (2014) Caffe: convolutional architecture for fast feature embedding. arXiv:1408.5093
  15. Krizhevsky A, Hinton G (2009) Learning multiple layers of features from tiny images. Technical report, University of Toronto
  16. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. In: Proceedings of NIPS
  17. Lin M, Chen Q, Yan S (2013) Network in network. arXiv:1312.4400
  18. Lin Z, Courbariaux M, Memisevic R, Bengio Y (2015) Neural networks with few multiplications. CoRR, abs/1510.03009
  19. Rastegari M, Ordonez V, Redmon J, Farhadi A (2015) XNOR-Net: ImageNet classification using binary convolutional neural networks. In: European conference on computer vision. Springer, Berlin, pp 525–542
  20. Romero A, Ballas N, Kahou SE, Chassang A, Gatta C, Bengio Y (2014) FitNets: hints for thin deep nets. CoRR, abs/1412.6550
  21. Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg AC, Fei-Fei L (2015) ImageNet large scale visual recognition challenge. Int J Comput Vis 115(3):211–252
  22. Shafique M, Hafiz R, Javed MU, Abbas S, Sekanina L, Vasicek Z, Mrazek V (2017) Adaptive and energy-efficient architectures for machine learning: challenges, opportunities, and research roadmap. In: 2017 IEEE computer society annual symposium on VLSI (ISVLSI). IEEE, Piscataway, pp 627–632
  23. Szegedy C, Ioffe S, Vanhoucke V, Alemi AA (2017) Inception-v4, Inception-ResNet and the impact of residual connections on learning. In: Proceedings of the AAAI conference on artificial intelligence, vol 4, p 12
  24. Tann H, Hashemi S, Bahar RI, Reda S (2016) Runtime configurable deep neural networks for energy-accuracy trade-off. CoRR, abs/1607.05418
  25. Tann H, Hashemi S, Bahar RI, Reda S (2017) Hardware-software codesign of accurate, multiplier-free deep neural networks. In: 2017 54th ACM/EDAC/IEEE design automation conference (DAC). IEEE, Piscataway, pp 1–6
  26. Zagoruyko S, Komodakis N (2016) Wide residual networks. arXiv:1605.07146

Acknowledgements

We would like to thank Professor R. Iris Bahar and N. Anthony for their contributions to this project [8, 25]. In comparison to our two previous publications [8, 25], this chapter provides additional experimental results for various quantization schemes and for ensemble deployment. More specifically, the novel contributions of this chapter include implementations of accelerators capable of performing ensemble inference for fixed-point (16,16), (8,8), and power-of-two (6,16) precisions. We also provide performance evaluations of these accelerators in side-by-side comparisons with those from our previous works in Figs. 14.7 and 14.8, and we generalize our ensemble technique for boosting accuracy to all types of quantized networks, not just dynamic fixed-point. The additional results fill the gaps between our two previous publications and allow for a more complete design space exploration of approximate deep neural network accelerators. This work is supported by NSF grant 1420864 and by generous GPU hardware donations from NVIDIA Corporation.
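As a rough sketch of the ensemble deployment mentioned above, the snippet below averages the class probabilities produced by several independently trained quantized networks and returns the consensus label; the `forward(x)` interface is an assumed placeholder for illustration, not the accelerators' actual API.

```python
import numpy as np

def ensemble_predict(models, x):
    """Average the output probability vectors of several quantized models
    and return the index of the most likely class (illustrative sketch)."""
    probs = np.mean([m.forward(x) for m in models], axis=0)
    return int(np.argmax(probs))
```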

Author information

Corresponding author

Correspondence to Sherief Reda.


Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Tann, H., Hashemi, S., Reda, S. (2019). Lightweight Deep Neural Network Accelerators Using Approximate SW/HW Techniques. In: Reda, S., Shafique, M. (eds) Approximate Circuits. Springer, Cham. https://doi.org/10.1007/978-3-319-99322-5_14

  • DOI: https://doi.org/10.1007/978-3-319-99322-5_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-99321-8

  • Online ISBN: 978-3-319-99322-5

  • eBook Packages: Engineering, Engineering (R0)
