
Practical Online Reinforcement Learning for Microprocessors With Micro-Armed Bandit



Abstract:

Although online reinforcement learning (RL) has shown promise for microarchitecture decision making, processor vendors are still reluctant to adopt it. There are two main reasons that make RL-based solutions unattractive. First, they have high complexity and storage overhead. Second, many RL agents are engineered for a specific problem and are not reusable. In this work, we propose a way to tackle these shortcomings. We find that, in diverse microarchitecture problems, only a few actions are useful in a given time window. Motivated by this property, we design Micro-Armed Bandit (or Bandit for short), an RL agent that is based on the low-complexity Multi-Armed Bandit algorithms. We show that Bandit can match or exceed the performance of more complex RL and non-RL alternatives in two different problems: data prefetching and instruction fetch thread selection in simultaneous multithreaded processors. We believe that Bandit’s simplicity, reusability, and small storage overhead make online RL more practical for microarchitecture.
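To make the core idea concrete, the following is a minimal sketch of a classic low-complexity Multi-Armed Bandit algorithm (UCB1), of the kind the abstract refers to. The paper's exact agent, action set, and reward definition are not given here, so the class name, the choice of UCB1, and the reward model below are illustrative assumptions, not the authors' implementation.

```python
import math
import random

class UCB1Bandit:
    """Illustrative UCB1 multi-armed bandit (an assumption; not the
    paper's exact algorithm). Each 'arm' would correspond to one
    microarchitectural action, e.g. a prefetch offset or a thread to
    fetch from; the agent picks the arm maximizing its running mean
    reward plus an exploration bonus."""

    def __init__(self, n_actions):
        self.counts = [0] * n_actions    # times each action was taken
        self.values = [0.0] * n_actions  # running mean reward per action
        self.total = 0                   # total decisions made

    def select(self):
        # Try each action once before applying the UCB rule.
        for a, c in enumerate(self.counts):
            if c == 0:
                return a
        def ucb(a):
            bonus = math.sqrt(2.0 * math.log(self.total) / self.counts[a])
            return self.values[a] + bonus
        return max(range(len(self.counts)), key=ucb)

    def update(self, action, reward):
        # Incrementally update the mean reward for the chosen action.
        self.counts[action] += 1
        self.total += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]
```

The storage cost is one counter and one running mean per action, which illustrates why such agents can be far cheaper than table- or network-based RL; in a simulated environment where one action yields reward more often, the agent concentrates its choices on that action over time.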
Published in: IEEE Micro (Volume 44, Issue 4, July-Aug. 2024)
Pages: 80-87
Date of Publication: 05 June 2024

