
HLS-Based Acceleration Framework for Deep Convolutional Neural Networks

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12083)

Abstract

Deep Neural Networks (DNNs) have been successfully applied in many fields. Considering performance, flexibility, and energy efficiency, Field Programmable Gate Array (FPGA) based accelerators for DNNs are a promising solution. Existing frameworks, however, offer little reusability and make it difficult to design a new network with minimal effort. Modern high-level synthesis (HLS) tools greatly reduce the turnaround time for designing and implementing complex FPGA-based accelerators. This paper presents a framework for building hardware accelerators for DNNs from a high-level specification. A novel architecture is introduced that maximizes data reuse and external memory bandwidth utilization. The framework generates scalable HLS code for a given pre-trained model that can be mapped to different FPGA platforms. Various HLS compiler optimizations have been applied to the code to produce an efficient implementation with high resource utilization. The framework achieves a peak performance of 23 frames per second for SqueezeNet on a Xilinx Alveo U250 board.
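
To give a concrete flavor of the HLS compiler optimizations the abstract mentions, the sketch below shows a tiled 3x3 convolution kernel with typical Vivado HLS pragmas for pipelining and array partitioning. This is a minimal illustration under stated assumptions, not the authors' actual code: the function name, fixed-point format, and tile size are all hypothetical.

```cpp
#include <ap_fixed.h>

typedef ap_fixed<16, 6> data_t;  // assumed 16-bit fixed-point format

const int TILE = 16;  // output tile size (illustrative)
const int K    = 3;   // convolution kernel size

// Hypothetical kernel: computes one TILE x TILE output tile of a 3x3
// convolution from an on-chip input tile and a K x K weight array.
void conv_tile(const data_t in[TILE + K - 1][TILE + K - 1],
               const data_t w[K][K],
               data_t out[TILE][TILE]) {
    // Expose all K*K weights in a single cycle.
#pragma HLS ARRAY_PARTITION variable=w complete dim=0

ROW:
    for (int r = 0; r < TILE; ++r) {
    COL:
        for (int c = 0; c < TILE; ++c) {
            // One output pixel per clock cycle; the inner K x K loops
            // below are fully unrolled by this pragma.
#pragma HLS PIPELINE II=1
            data_t acc = 0;
        KY:
            for (int ky = 0; ky < K; ++ky) {
            KX:
                for (int kx = 0; kx < K; ++kx) {
                    acc += in[r + ky][c + kx] * w[ky][kx];
                }
            }
            out[r][c] = acc;
        }
    }
}
```

Pipelining the per-pixel loop at II=1 implicitly unrolls the K x K multiply-accumulate, and fully partitioning the weight array makes every tap readable in one cycle. Buffering an input tile on chip once and reusing it across overlapping windows is the kind of data-reuse and external-bandwidth optimization the abstract describes.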

Acknowledgments

This work is funded by the National Science Foundation's Major Research Instrumentation program, grant #1725729. We thank Yuan Ma for his help in setting up the simulation using Caffe and Tanitpong Lawphongpanich for his contributions to TensorFlow testing.

Author information

Corresponding author

Correspondence to Volodymyr Kindratenko.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Misra, A., Kindratenko, V. (2020). HLS-Based Acceleration Framework for Deep Convolutional Neural Networks. In: Rincón, F., Barba, J., So, H., Diniz, P., Caba, J. (eds) Applied Reconfigurable Computing. Architectures, Tools, and Applications. ARC 2020. Lecture Notes in Computer Science, vol 12083. Springer, Cham. https://doi.org/10.1007/978-3-030-44534-8_17

  • DOI: https://doi.org/10.1007/978-3-030-44534-8_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-44533-1

  • Online ISBN: 978-3-030-44534-8

  • eBook Packages: Computer Science, Computer Science (R0)
