
Balancing Convolutional Neural Networks Pipeline in FPGAs

  • Conference paper
  • In: Artificial Neural Networks and Machine Learning – ICANN 2018 (ICANN 2018)

Abstract

Convolutional Neural Networks (CNNs) have achieved excellent performance in image classification and have been successfully applied in a wide range of domains. However, their demand for processing power poses a challenge to their implementation in embedded real-time applications. To tackle this problem, this work focuses on the FPGA acceleration of the convolutional layers, since they account for about 90% of the overall computational load. We implemented buffers to reduce the storage of feature maps and, consequently, to facilitate the allocation of all kernel weights in Block RAMs (BRAMs). Moreover, we used 8-bit kernel weights, rounded from an already trained CNN, to further reduce the memory requirements, storing them in multiple BRAMs to improve kernel-loading throughput. To balance the pipeline of convolutions across the convolutional layers, we adjusted the amount of parallel computation in the convolutional step of each layer. We adopted the AlexNet CNN architecture to run our experiments and compare the results. We were able to run the inference of the convolutional layers in 3.9 ms at a maximum operating frequency of 76.9 MHz.
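The two ideas in the abstract can be sketched in software: rounding trained floating-point weights to 8-bit fixed-point codes, and balancing a layer pipeline by assigning more parallel multipliers to heavier convolutional layers so that every stage finishes in roughly the same number of cycles. The sketch below is illustrative only; the fixed-point scale, the cycle target, and the helper names are assumptions, not the authors' actual design. The layer shapes follow AlexNet's five convolutional layers.

```python
def quantize_8bit(w, frac_bits=6):
    """Round a float weight to a signed 8-bit fixed-point code.

    frac_bits is an assumed scale; the paper only states that
    weights are rounded to 8 bits from a trained CNN.
    """
    code = round(w * (1 << frac_bits))
    return max(-128, min(127, code))  # saturate to the int8 range

# Approximate multiply-accumulate counts per AlexNet conv layer:
# MACs = out_h * out_w * out_ch * in_ch_per_group * k * k
alexnet_macs = {
    "conv1": 55 * 55 * 96 * 3 * 11 * 11,
    "conv2": 27 * 27 * 256 * 48 * 5 * 5,
    "conv3": 13 * 13 * 384 * 256 * 3 * 3,
    "conv4": 13 * 13 * 384 * 192 * 3 * 3,
    "conv5": 13 * 13 * 256 * 192 * 3 * 3,
}

def balance_parallelism(macs_per_layer, target_cycles):
    """Pick a parallelism factor per layer so each pipeline stage
    completes its layer in at most target_cycles cycles."""
    return {name: max(1, -(-macs // target_cycles))  # ceiling division
            for name, macs in macs_per_layer.items()}

factors = balance_parallelism(alexnet_macs, target_cycles=300_000)
```

Heavier layers receive proportionally larger parallelism factors, which is the essence of balancing the pipeline: no single convolutional layer becomes the bottleneck stage.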



Acknowledgments

Mark Cappello Ferreira de Sousa gratefully acknowledges the National Council for Scientific and Technological Development (CNPq) for partially supporting this research. Mark also acknowledges Stelvio Henrique Ignacio Barboza, Anelise Scotti Scherer, and the Academic Literacy Laboratory for valuable comments. Miguel Angelo de Abreu de Sousa acknowledges the support of the Federal Institute of Education, Science and Technology of São Paulo (IFSP).


Corresponding author

Correspondence to Mark Cappello Ferreira de Sousa.


Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

de Sousa, M.C.F., de Abreu de Sousa, M.A., Del-Moral-Hernandez, E. (2018). Balancing Convolutional Neural Networks Pipeline in FPGAs. In: Kůrková, V., Manolopoulos, Y., Hammer, B., Iliadis, L., Maglogiannis, I. (eds) Artificial Neural Networks and Machine Learning – ICANN 2018. ICANN 2018. Lecture Notes in Computer Science, vol 11139. Springer, Cham. https://doi.org/10.1007/978-3-030-01418-6_17


  • DOI: https://doi.org/10.1007/978-3-030-01418-6_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01417-9

  • Online ISBN: 978-3-030-01418-6

  • eBook Packages: Computer Science (R0)
