A RISC-V Hardware Accelerator for Q-Learning Algorithm

Angeloni, Damiano; Canese, Lorenzo; Cardarilli, Gian Carlo; Di Nunzio, Luca; Re, Marco; Spanò, Sergio

doi:10.1007/978-3-031-48121-5_11

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 1110)

Included in the following conference series:

International Conference on Applications in Electronics Pervading Industry, Environment and Society

Abstract

We propose a Q-Learning hardware accelerator for a RISC-V platform, focusing on the Klessydra processor. To the best of our knowledge, this is the first work in the literature to address this topic. We implemented the system on an AMD-Xilinx ZedBoard development board; it uses a small amount of hardware resources and requires a limited dynamic power of 1.528 W. These figures leave room for implementing additional accelerators on the same device to extend the capabilities of the system. Compared to a standard software version of the algorithm, our accelerator achieves a \(36\times\) speed-up in convergence time and a \(34\times\) energy saving. These results show that the proposed system is suitable for high-speed, low-energy applications such as Edge Machine Learning and embedded IoT systems.
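For readers unfamiliar with the algorithm being accelerated, the following is a minimal sketch of the standard tabular Q-Learning update, of the kind a software baseline would execute in a loop. It is not the authors' implementation: the toy chain environment, the function names `q_update` and `train_chain`, and the hyperparameter values are all illustrative assumptions.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-Learning update in place and return the new value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def train_chain(n_states=5, episodes=200, epsilon=0.1, seed=0):
    """Train on a hypothetical toy task: walk along a chain to the last state."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy action selection (ties resolved toward "right").
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(state - 1, 0) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            q_update(q, state, action, reward, next_state)
            state = next_state
    return q


if __name__ == "__main__":
    for s, row in enumerate(train_chain()):
        print(s, [round(v, 3) for v in row])
```

The inner loop is dominated by the read of `max(q[next_state])` and the multiply-accumulate in `q_update`, which is the kind of repetitive, table-bound arithmetic a dedicated accelerator can take over from the processor.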

http://dspvlsi.uniroma2.it/.

Acknowledgments

The authors would like to thank Advanced Micro Devices, Inc. (AMD) for providing the FPGA software tools with the AMD-Xilinx University Program. This work is partially supported by Project ECS 0000024 Rome Technopole, CUP B83C22002820006, NRP Mission 4 Component 2 Investment 1.5, funded by the European Union—NextGenerationEU.

Author information

Authors and Affiliations

Tor Vergata University of Rome, Via del Politecnico 1, 00133, Rome, Italy
Damiano Angeloni, Lorenzo Canese, Gian Carlo Cardarilli, Luca Di Nunzio, Marco Re & Sergio Spanò

Corresponding author

Correspondence to Sergio Spanò.

Editor information

Editors and Affiliations

Department of Electrical, Electronic, Telecommunication Engineering and Naval Architecture (DITEN), University of Genoa, Genoa, Italy
Francesco Bellotti
Department of Electrical and Computer Engineering, Hellenic Mediterranean University, Heraklion, Greece
Miltos D. Grammatikakis
Lab-STICC, CNRS UMR 6285, École nationale supérieure de techniques avancées Bretagne, Brest, France
Ali Mansour
Department of Electronics and Telecommunications, Politecnico di Torino, Turin, Italy
Massimo Ruo Roch
Department of Computer Science, HTWG Konstanz—University of Applied Sciences, Konstanz, Germany
Ralf Seepold
Department of Computer Engineering and Mathematics, Universitat Rovira i Virgili, Tarragona, Catalonia, Spain
Agusti Solanas
Department of Electrical, Electronic, Telecommunication Engineering and Naval Architecture (DITEN), University of Genoa, Genoa, Italy
Riccardo Berta

About this paper

Cite this paper

Angeloni, D., Canese, L., Cardarilli, G.C., Di Nunzio, L., Re, M., Spanò, S. (2024). A RISC-V Hardware Accelerator for Q-Learning Algorithm. In: Bellotti, F., et al. Applications in Electronics Pervading Industry, Environment and Society. ApplePies 2023. Lecture Notes in Electrical Engineering, vol 1110. Springer, Cham. https://doi.org/10.1007/978-3-031-48121-5_11

DOI: https://doi.org/10.1007/978-3-031-48121-5_11
Published: 13 January 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-48120-8
Online ISBN: 978-3-031-48121-5
eBook Packages: Engineering, Engineering (R0)
