ABSTRACT
Learning classifier tables (LCTs) are lightweight, classifier-based, hardware-implemented reinforcement learning (RL) building blocks that enable self-adaptivity and self-optimization in multicore systems. LCTs are deployed per core to learn and optimize potentially conflicting objectives under constraints. Experience replay (ER) is a replay-memory technique in RL in which an agent's experiences are stored in a buffer and reused to improve the learning process. Implementing an ER buffer in hardware requires dedicated memory and is therefore expensive. We introduce LCT-DER: an LCT with dynamic-sized experience replay, in which the classifier population and the stored experiences share the same memory by exploiting the concept of macro-classifiers. Performing DVFS, LCT-DER achieves 44.5% fewer power-budget overshoots and a 4.5% lower IPS difference compared to a standard LCT, without requiring additional memory.
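The core idea of the abstract (classifier population and ER entries sharing one fixed-size memory, with macro-classifiers freeing slots by merging duplicates) can be illustrated with a minimal software sketch. This is an assumption-laden illustration, not the paper's hardware design: the class and method names (`SharedMemoryLCT`, `insert_classifier`, `store_experience`) and the eviction policy are hypothetical.

```python
import random

class MacroClassifier:
    """A (condition, action) rule; numerosity counts merged identical micro-classifiers."""
    def __init__(self, condition, action):
        self.condition = condition
        self.action = action
        self.numerosity = 1

class SharedMemoryLCT:
    """Illustrative sketch of LCT-DER's shared memory: the classifier population
    and the experience-replay entries occupy the same fixed-size table, so the
    ER buffer grows or shrinks with whatever slots the population leaves free."""
    def __init__(self, capacity):
        self.capacity = capacity          # total shared memory slots
        self.population = []              # macro-classifiers
        self.experiences = []             # (state, action, reward, next_state)

    def used_slots(self):
        return len(self.population) + len(self.experiences)

    def insert_classifier(self, condition, action):
        for cl in self.population:
            if cl.condition == condition and cl.action == action:
                cl.numerosity += 1        # macro-classifier absorbs the duplicate: no new slot
                return
        if self.used_slots() < self.capacity:
            self.population.append(MacroClassifier(condition, action))
        # else: a deletion/eviction policy would run here (omitted)

    def store_experience(self, exp):
        if self.used_slots() >= self.capacity and self.experiences:
            self.experiences.pop(0)       # shared memory full: evict the oldest experience
        if self.used_slots() < self.capacity:
            self.experiences.append(exp)

    def sample(self, k=1):
        """Draw a mini-batch of past experiences for replay-based updates."""
        return random.sample(self.experiences, min(k, len(self.experiences)))
```

Merging duplicates into one macro-classifier is what makes the replay buffer "dynamic-sized": every absorbed duplicate is a table slot that an experience tuple can occupy instead.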
Index Terms
- LCT-DER: Learning Classifier Table with Dynamic-Sized Experience Replay for Run-time SoC Performance-Power Optimization