Abstract:
Manycore systems emerged as a solution to the limitations of single-core processors in meeting modern computational demands. Effective task mapping and migration are essential in these systems to optimize computational performance without exceeding the Thermal Design Power (TDP) constraints. Additionally, temperature management is crucial for ensuring the system’s long-term reliability. This research proposes a lightweight and scalable heuristic for reliability-aware task mapping and migration. Our approach employs Reinforcement Learning (RL) to optimize mapping and migration decisions. The proposed method uses a lookup table that is pre-trained with Q-learning. The pre-training enables dynamic adjustment of the task distribution in response to the current task mapping and the tasks’ power consumption. Experimental results demonstrate that the proposed method outperforms other strategies, managing peak temperatures effectively and improving the system’s Mean Time To Failure (MTTF). This study provides a robust framework for task management in manycore systems and lays the groundwork for future exploration of autonomous system optimization.
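To make the idea of a Q-learning pre-trained lookup table concrete, the following is a minimal sketch under toy assumptions. The abstract does not specify the state, action, or reward definitions, so everything here is hypothetical: the state is the index of the currently hottest core, the action is the core that receives the next task, and the reward penalizes mapping a task onto the hottest core. The names `step`, `pretrain`, and `choose_core`, the core count, and the hyperparameters are illustrative only.

```python
# Toy sketch only: the state, action, and reward definitions below are
# illustrative assumptions, not the formulation used in the paper.

NUM_CORES = 4   # hypothetical core count
ALPHA = 0.5     # learning rate (assumed value)
GAMMA = 0.9     # discount factor (assumed value)


def step(hottest, target):
    """Hypothetical environment model.

    Mapping a task onto the hottest core is penalized; the core that
    receives the task becomes the hottest core in the next state.
    """
    reward = -1.0 if target == hottest else 0.0
    return target, reward


def pretrain(sweeps=50):
    """Build the lookup table offline by sweeping every (state, action) pair."""
    q = [[0.0] * NUM_CORES for _ in range(NUM_CORES)]
    for _ in range(sweeps):
        for state in range(NUM_CORES):
            for action in range(NUM_CORES):
                next_state, reward = step(state, action)
                # Standard tabular Q-learning update.
                td_target = reward + GAMMA * max(q[next_state])
                q[state][action] += ALPHA * (td_target - q[state][action])
    return q


def choose_core(q, hottest):
    """Runtime decision: a single greedy lookup in the pre-trained table."""
    return max(range(NUM_CORES), key=lambda core: q[hottest][core])


Q_TABLE = pretrain()
POLICY = [choose_core(Q_TABLE, s) for s in range(NUM_CORES)]
```

In this sketch all learning happens in `pretrain`, so the runtime cost of a mapping decision is one row scan of the table, which is the sense in which a lookup-table approach can be lightweight. A realistic state would also need to encode quantities such as per-core power or temperature, which this toy model omits.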
Published in: 2024 37th SBC/SBMicro/IEEE Symposium on Integrated Circuits and Systems Design (SBCCI)
Date of Conference: 02-06 September 2024
Date Added to IEEE Xplore: 09 October 2024