Abstract
Deep neural networks (DNNs) are widely used for classification, prediction, and regression tasks. Achieving good performance and accuracy across applications requires an optimized network architecture, which is typically found through experiments and performance evaluation on different network topologies. A custom hardware accelerator, however, is not scalable and lacks the flexibility to switch from one topology to another at run time. To support convolutional neural networks (CNNs) and multilayer perceptron neural networks (MLPNNs) of different sizes, this paper presents an accelerator architecture for FPGAs that can be programmed at run time. The combined CNN and MLP accelerator (CNN-MLPA) can run any CNN or MLPNN application without re-synthesis, so the time otherwise spent on synthesis, placement, and routing is saved when executing different applications on the proposed architecture. Run-time results show that the CNN-MLPA handles network topologies of different sizes without significant performance degradation. We evaluated resource utilization and execution time on a Xilinx Virtex-7 FPGA board with several benchmark datasets to demonstrate that the design is run-time programmable, portable, and scalable to any FPGA. The accelerator was further optimized to increase throughput through pipelining and concurrency, and to reduce resource consumption through fixed-point operations.
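The abstract's fixed-point optimization can be illustrated with a minimal sketch. The paper does not specify its number format, so the Q1.15 word width, the `to_fixed` helper, and the `fixed_mac` routine below are assumptions chosen only to show why integer multiply-accumulate is cheaper on an FPGA than floating point:

```python
# Hypothetical sketch (not the paper's actual format): quantize values to
# signed 16-bit Q1.15 fixed point and run the MAC entirely in integers,
# as an FPGA DSP slice would, rescaling back to a real value once at the end.

def to_fixed(x, frac_bits=15):
    """Quantize a float to a signed 16-bit fixed-point integer (saturating)."""
    scaled = int(round(x * (1 << frac_bits)))
    lo, hi = -(1 << 15), (1 << 15) - 1  # representable Q1.15 range
    return max(lo, min(hi, scaled))

def fixed_mac(weights, inputs, frac_bits=15):
    """Integer multiply-accumulate over quantized operands."""
    acc = 0
    for w, x in zip(weights, inputs):
        acc += to_fixed(w, frac_bits) * to_fixed(x, frac_bits)
    # Products carry 2 * frac_bits fraction bits; one final rescale recovers a real value.
    return acc / float(1 << (2 * frac_bits))
```

Because every operation inside the loop is an integer multiply and add, the hardware needs no floating-point units, which is one plausible reading of how fixed-point operations reduce resource consumption in the proposed design.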
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Kabir, E., Poudel, A., Aklah, Z., Huang, M., Andrews, D. (2022). A Runtime Programmable Accelerator for Convolutional and Multilayer Perceptron Neural Networks on FPGA. In: Gan, L., Wang, Y., Xue, W., Chau, T. (eds) Applied Reconfigurable Computing. Architectures, Tools, and Applications. ARC 2022. Lecture Notes in Computer Science, vol 13569. Springer, Cham. https://doi.org/10.1007/978-3-031-19983-7_3
Print ISBN: 978-3-031-19982-0
Online ISBN: 978-3-031-19983-7