Abstract
Convolutional Neural Networks (CNNs) have achieved excellent performance in image classification and have been successfully applied in a wide range of domains. However, their processing power demand poses a challenge to their implementation in embedded real-time applications. To tackle this problem, in this work we focused on the FPGA acceleration of the convolutional layers, since they account for about 90% of the overall computational load. We implemented buffers that reduce the storage of feature maps and, consequently, facilitate the allocation of all the kernel weights in Block-RAMs (BRAMs). Moreover, we used 8-bit kernel weights, rounded from an already trained CNN, to further reduce memory requirements, storing them across multiple BRAMs to increase kernel-loading throughput. To balance the pipeline of convolutions across the convolutional layers, we adjusted the amount of parallel computation in the convolutional step of each layer. We adopted the AlexNet CNN architecture to run our experiments and compare results. We were able to run the inference of the convolutional layers in 3.9 ms at a maximum operating frequency of 76.9 MHz.
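The abstract states that 8-bit kernel weights were obtained by rounding the weights of an already trained CNN, but gives no quantization formula. A minimal sketch of one plausible scheme, symmetric linear quantization to signed 8-bit integers, is shown below; the function name, the per-tensor scale, and the example kernel shape (an AlexNet conv1-style 11×11×3×96 tensor) are illustrative assumptions, not details from the paper.

```python
import numpy as np

def quantize_weights_8bit(w, bits=8):
    """Round trained float weights to signed 8-bit integers using a
    symmetric linear scale (an assumed scheme; the paper only states
    that weights were rounded to 8 bits)."""
    qmax = 2 ** (bits - 1) - 1                 # 127 for 8 bits
    scale = np.max(np.abs(w)) / qmax           # one scale per kernel tensor
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

# Example: quantize a random conv1-style kernel tensor (11x11x3, 96 filters)
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=(11, 11, 3, 96)).astype(np.float32)
q, scale = quantize_weights_8bit(w)

# Dequantize to inspect the rounding error introduced by 8-bit storage
dequant = q.astype(np.float32) * scale
print("max abs rounding error:", np.max(np.abs(dequant - w)))
```

With this scheme the worst-case rounding error is half the scale step, which is what makes storing the full set of kernel weights in on-chip BRAMs feasible at a quarter of the 32-bit float footprint.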
Acknowledgments
Mark Cappello Ferreira de Sousa gratefully acknowledges the National Council for Scientific and Technological Development (CNPq) for partially supporting this research. Mark also thanks Stelvio Henrique Ignacio Barboza, Anelise Scotti Scherer, and the Academic Literacy Laboratory for valuable comments. Miguel Angelo de Abreu de Sousa acknowledges the support of the Federal Institute of Education, Science and Technology of São Paulo (IFSP).
Copyright information
© 2018 Springer Nature Switzerland AG
Cite this paper
de Sousa, M.C.F., de Abreu de Sousa, M.A., Del-Moral-Hernandez, E. (2018). Balancing Convolutional Neural Networks Pipeline in FPGAs. In: Kůrková, V., Manolopoulos, Y., Hammer, B., Iliadis, L., Maglogiannis, I. (eds) Artificial Neural Networks and Machine Learning – ICANN 2018. ICANN 2018. Lecture Notes in Computer Science(), vol 11139. Springer, Cham. https://doi.org/10.1007/978-3-030-01418-6_17
Print ISBN: 978-3-030-01417-9
Online ISBN: 978-3-030-01418-6
eBook Packages: Computer Science (R0)