A RISC-V Hardware Accelerator for Q-Learning Algorithm

  • Conference paper
Applications in Electronics Pervading Industry, Environment and Society (ApplePies 2023)

Abstract

We propose a Q-Learning hardware accelerator for a RISC-V platform, targeting the Klessydra processor in particular. To the best of our knowledge, this is the first work in the literature to address this topic. We implemented the system on an AMD-Xilinx ZedBoard development board, using a small amount of hardware resources and a dynamic power of only 1.528 W. These figures leave room for implementing additional accelerators on the same device to extend the capabilities of the system. Compared to a standard software version of the algorithm, our accelerator achieves a \(36\times\) speed-up in convergence time and a \(34\times\) reduction in energy. These results show that the proposed system is suitable for high-speed, low-energy applications such as Edge Machine Learning and embedded IoT systems.
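
For readers unfamiliar with the algorithm being accelerated, the following is a minimal sketch of textbook tabular Q-Learning in software. It is not the authors' baseline or their hardware design: the abstract does not specify the environment, hyperparameters, or action-selection policy, so the five-state chain, the values of `ALPHA` and `GAMMA`, the uniformly random behaviour policy, and all function names below are assumptions chosen only for illustration.

```python
import random

# Hypothetical toy problem: a chain of states 0..4 where state 4 is the goal.
N_STATES = 5
ACTIONS = (0, 1)          # 0 = move left, 1 = move right
ALPHA, GAMMA = 0.5, 0.9   # learning rate and discount factor (assumed values)


def step(state, action):
    """Deterministic transition: reward 1.0 only when the goal is reached."""
    nxt = max(state - 1, 0) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train(episodes=500, seed=0, max_steps=10_000):
    """Run tabular Q-Learning and return the learned Q-table."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Q-Learning is off-policy, so a random behaviour policy suffices here.
            action = rng.choice(ACTIONS)
            nxt, reward, done = step(state, action)
            # Update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            target = reward if done else reward + GAMMA * max(q[nxt])
            q[state][action] += ALPHA * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Best action for each non-terminal state according to the Q-table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(train())` returns the learned action for each non-terminal state. The inner update (a table read, a maximum over actions, a multiply-accumulate, and a table write) is the kind of per-step work that a Q-Learning accelerator would typically move into hardware.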

http://dspvlsi.uniroma2.it/.

Acknowledgments

The authors would like to thank Advanced Micro Devices, Inc. (AMD) for providing the FPGA software tools with the AMD-Xilinx University Program. This work is partially supported by Project ECS 0000024 Rome Technopole, CUP B83C22002820006, NRP Mission 4 Component 2 Investment 1.5, funded by the European Union—NextGenerationEU.

Corresponding author

Correspondence to Sergio Spanò.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Angeloni, D., Canese, L., Cardarilli, G.C., Di Nunzio, L., Re, M., Spanò, S. (2024). A RISC-V Hardware Accelerator for Q-Learning Algorithm. In: Bellotti, F., et al. Applications in Electronics Pervading Industry, Environment and Society. ApplePies 2023. Lecture Notes in Electrical Engineering, vol 1110. Springer, Cham. https://doi.org/10.1007/978-3-031-48121-5_11

  • DOI: https://doi.org/10.1007/978-3-031-48121-5_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-48120-8

  • Online ISBN: 978-3-031-48121-5

  • eBook Packages: Engineering, Engineering (R0)
