Reinforcement learning (RL) is revolutionizing the field of Artificial Intelligence (AI) and represents a step toward building optimal, autonomous systems with a higher level of understanding (Arulkumaran et al. in IEEE Signal Processing Mag 34(6):26–38, 2017). One of the main goals of AI is to produce fully autonomous agents that interact with several features and learn optimal behavior. Applications vary in their data access patterns, and a static hardware configuration is not ideal for all phases of a workload. Today's Xeon cores have multiple data prefetchers that fetch the next sets of data to be used; however, these prefetchers may interact in destructive ways. This destructive behavior can cause several problems, such as increased cache pollution, memory bandwidth bottlenecks, and additional occupancy in critical-path demand queues. Managing the aggressiveness of the prefetchers is necessary to mitigate these problems. Current hardware manages prefetcher aggressiveness by monitoring telemetry such as memory bandwidth and accuracy. However, this telemetry does not necessarily correlate with overall system performance. In addition, other solutions optimize prefetchers individually rather than allowing multiple features to work together. This research introduces hierarchical smart agents that use reinforcement learning to find the optimal aggressiveness for the Mid-level Cache (MLC) prefetchers at run time, managed by the Smart Prefetchers Manager (SPM). We expand our previous work by evaluating more workloads on a hierarchical model and by applying reinforcement learning in addition to an offline training approach. The approach is implemented and evaluated in a single-core, single-process environment to optimize the three MLC prefetchers at run time.
Results demonstrate that reinforcement learning can deliver up to a 7.18% improvement in instructions per cycle (IPC) over the state-of-the-art hardware solution.
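
To make the idea of learning prefetcher aggressiveness from an IPC reward concrete, the following is a minimal sketch and not the SPM or the hierarchical agents evaluated in this work. It uses a plain epsilon-greedy bandit, one of the simplest forms of reinforcement learning, over the joint setting of three prefetchers. The aggressiveness levels, the `simulated_ipc` function, and its arbitrary optimum are all assumptions made for illustration; on real hardware the reward would come from measured IPC.

```python
import itertools
import random

# Hypothetical aggressiveness levels; real prefetchers expose different knobs.
LEVELS = (0, 1, 2, 3)
# One joint action = one level for each of the three prefetchers (64 in total).
ACTIONS = list(itertools.product(LEVELS, repeat=3))


def simulated_ipc(action, rng):
    """Toy stand-in for an IPC measurement; NOT a model of real hardware."""
    best = (2, 1, 3)  # arbitrary optimum chosen only for this demo
    distance = sum(abs(a - b) for a, b in zip(action, best))
    # IPC drops as the setting moves away from the optimum, plus bounded noise.
    return 1.5 - 0.1 * distance + rng.uniform(-0.02, 0.02)


def tune(steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy search over joint prefetcher settings, rewarded by IPC."""
    rng = random.Random(seed)
    # Optimistic initial values make the greedy choice try every setting once.
    value = {a: 10.0 for a in ACTIONS}
    count = {a: 0 for a in ACTIONS}
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)          # explore
        else:
            action = max(ACTIONS, key=value.get)  # exploit best estimate
        reward = simulated_ipc(action, rng)
        count[action] += 1
        # Incremental running mean of the observed reward for this setting.
        value[action] += (reward - value[action]) / count[action]
    return max(ACTIONS, key=value.get)


if __name__ == "__main__":
    print("selected aggressiveness per prefetcher:", tune())
```

Treating the three prefetchers as one joint action reflects the point made above that optimizing each prefetcher in isolation ignores how they interact; a hierarchical design would instead split this decision across cooperating agents.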

No datasets were generated or analysed during the current study.
Fargo, F., Diamond, M., Franza, O. et al. Intelligent cache prefetchers in HPC architecture. Cluster Comput 28, 154 (2025).
