
A Multi-FPGA Scalable Framework for Deep Reinforcement Learning Through Neuroevolution

  • Conference paper
  • First Online:
Applied Reconfigurable Computing. Architectures, Tools, and Applications (ARC 2022)

Abstract

The application of Deep Neural Networks (DNNs) to reinforcement learning has proven effective in solving complex problems, such as playing video games or training robots to perform human tasks. Reinforcement-based training implies continuous interaction between the DNN-powered agent and the environment, blurring the typical separation between the training and inference stages of deep learning. However, the high memory and arithmetic-precision requirements of gradient-based training algorithms prevent the use of FPGAs for these applications. As an alternative, this work demonstrates the feasibility of training DNNs with Evolutionary Algorithms (EAs) and of using them in reinforcement learning scenarios. Unlike backpropagation, EA-based training of neural networks, referred to as neuroevolution, can be implemented efficiently on FPGAs. Moreover, this paper shows how the inherent parallelism of EAs can be exploited in multi-FPGA scenarios to accelerate the learning process. The proposed FPGA-based neuroevolutionary framework has been validated by building a system that autonomously learns to play the Atari game Pong in fewer than 25 generations.

This project has been funded by the Spanish Ministry for Science and Innovation under the project TALENT (ref. PID2020-116417RB-C42).
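To make the idea concrete, the following is a minimal, illustrative sketch of a neuroevolution loop of the kind described in the abstract: a population of weight vectors for a fixed-topology network is evaluated by running episodes, the fittest genomes are kept, and mutated copies of them form the next generation. This is not the authors' implementation; the toy environment, layer sizes, and hyperparameters are assumptions chosen only so the example runs on a CPU, whereas in the paper it is the costly fitness-evaluation step that is offloaded to the FPGAs.

```python
import numpy as np


class ToyEnv:
    """Illustrative stand-in environment (not Atari Pong): the agent earns a
    reward whenever it outputs the sign of the summed observation."""

    def __init__(self, steps=100, seed=0):
        self.steps, self.seed = steps, seed

    def run_episode(self, policy):
        rng = np.random.default_rng(self.seed)   # same episode for every genome
        reward = 0.0
        for _ in range(self.steps):
            obs = rng.uniform(-1.0, 1.0, size=4)
            action = policy(obs)                 # 0 or 1
            reward += 1.0 if action == int(obs.sum() > 0) else 0.0
        return reward


# Fixed-topology MLP whose flattened weights and biases form the EA genome.
LAYERS = [(4, 8), (8, 2)]                        # assumed layer sizes
GENOME_LEN = sum(i * o + o for i, o in LAYERS)


def policy_from_genome(genome):
    def policy(obs):
        x, idx = obs, 0
        for i, o in LAYERS:
            w = genome[idx:idx + i * o].reshape(i, o); idx += i * o
            b = genome[idx:idx + o]; idx += o
            x = np.maximum(x @ w + b, 0.0)       # ReLU
        return int(np.argmax(x))
    return policy


def evaluate(genome, env):
    return env.run_episode(policy_from_genome(genome))


def neuroevolution(pop_size=50, elite=10, sigma=0.1, generations=25):
    """Mutation-only genetic algorithm over network weights."""
    rng = np.random.default_rng(42)
    env = ToyEnv()
    population = [rng.normal(0.0, 1.0, GENOME_LEN) for _ in range(pop_size)]
    for gen in range(generations):
        # In the paper's setting, this independent per-genome evaluation is
        # the step distributed across FPGA workers; here it runs sequentially.
        fitness = [evaluate(g, env) for g in population]
        order = np.argsort(fitness)[::-1]
        elites = [population[i] for i in order[:elite]]
        print(f"gen {gen:2d}  best fitness {fitness[order[0]]:.1f}")
        # Next generation: keep the elites and add mutated copies of them.
        population = elites + [
            elites[rng.integers(elite)] + rng.normal(0.0, sigma, GENOME_LEN)
            for _ in range(pop_size - elite)
        ]
    return elites[0]


if __name__ == "__main__":
    neuroevolution()
```

Because each genome's fitness is computed independently, the evaluation loop is embarrassingly parallel; this is the property that the multi-FPGA framework exploits to accelerate learning.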



Author information

Corresponding author

Correspondence to Andrés Otero.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Laserna, J., Otero, A., Torre, E.d.l. (2022). A Multi-FPGA Scalable Framework for Deep Reinforcement Learning Through Neuroevolution. In: Gan, L., Wang, Y., Xue, W., Chau, T. (eds) Applied Reconfigurable Computing. Architectures, Tools, and Applications. ARC 2022. Lecture Notes in Computer Science, vol 13569. Springer, Cham. https://doi.org/10.1007/978-3-031-19983-7_4

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-19983-7_4

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19982-0

  • Online ISBN: 978-3-031-19983-7

  • eBook Packages: Computer Science, Computer Science (R0)
