DOI: 10.1145/3218603.3218630

Pareto-Optimal Power- and Cache-Aware Task Mapping for Many-Cores with Distributed Shared Last-Level Cache

Published: 23 July 2018

Abstract

Two factors primarily affect the performance of multi-threaded tasks on many-core processors whose Last-Level Cache (LLC) is both shared and physically distributed: the power budget associated with a given task mapping, which is intended to guarantee thermally safe operation, and the non-uniform LLC access latency of threads running on different cores. Spatially distributing threads across the many-core increases the power budget but also increases the associated LLC latency. Conversely, mapping more threads to cores near the center of the many-core decreases the LLC latency but also decreases the power budget. Consequently, the two metrics (LLC latency and power budget) cannot be optimal simultaneously, which leads to a Pareto-optimization problem that has not previously been exploited. We are the first to present a run-time task mapping algorithm, called PCMap, that exploits this trade-off. Compared to the state of the art, our approach reduces the average task response time by up to 8.6% and the energy consumption by up to 8.5%.
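
The trade-off described in the abstract can be illustrated with a minimal Python sketch of Pareto dominance over candidate mappings. This is not the PCMap algorithm, which the abstract does not detail. The `Mapping` type, the helper names, the candidate names, and all numeric values are hypothetical and chosen only for illustration. Each candidate is scored by a power budget (higher is better) and an LLC latency (lower is better), and the sketch keeps the candidates that no other candidate beats on both metrics.

```python
from typing import List, NamedTuple


class Mapping(NamedTuple):
    """Hypothetical candidate task mapping with its two metrics."""
    name: str
    power_budget: float  # higher is better (illustrative units)
    llc_latency: float   # lower is better (illustrative units)


def dominates(a: Mapping, b: Mapping) -> bool:
    """True if a is no worse than b in both metrics and strictly better in one."""
    no_worse = (a.power_budget >= b.power_budget
                and a.llc_latency <= b.llc_latency)
    strictly_better = (a.power_budget > b.power_budget
                       or a.llc_latency < b.llc_latency)
    return no_worse and strictly_better


def pareto_front(candidates: List[Mapping]) -> List[Mapping]:
    """Keep every candidate that is not dominated by any other candidate."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Made-up values: spreading threads raises the power budget and the latency,
# centering threads lowers both.
candidates = [
    Mapping("spread", power_budget=60.0, llc_latency=40.0),
    Mapping("centered", power_budget=45.0, llc_latency=25.0),
    Mapping("mixed", power_budget=52.0, llc_latency=31.0),
    Mapping("poor", power_budget=44.0, llc_latency=38.0),
]

front = pareto_front(candidates)
print([m.name for m in front])  # ['spread', 'centered', 'mixed']
```

In this toy data, "poor" is dropped because "centered" has both a higher power budget and a lower latency. The remaining three mappings are mutually non-dominated, so choosing among them requires an additional criterion such as expected task response time.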

      Published In

      ISLPED '18: Proceedings of the International Symposium on Low Power Electronics and Design
      July 2018
      327 pages
      ISBN:9781450357043
      DOI:10.1145/3218603

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Cache Memory
      2. Dark Silicon
      3. Power Dissipation
      4. Processor Scheduling
      5. Thermal Stability

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate 398 of 1,159 submissions, 34%

      Article Metrics

      • Downloads (last 12 months): 8
      • Downloads (last 6 weeks): 0
      Reflects downloads up to 18 Feb 2025.

      Cited By

      • (2023) Thermal Management for S-NUCA Many-Cores via Synchronous Thread Rotations. 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1-6. DOI: 10.23919/DATE56975.2023.10136895. Online publication date: Apr-2023.
      • (2023) HDSAP: heterogeneity-aware dynamic scheduling algorithm to improve performance of nanoscale many-core processors for unknown workloads. The Journal of Supercomputing 79:12, pp. 13341-13369. DOI: 10.1007/s11227-023-05159-6. Online publication date: 23-Mar-2023.
      • (2020) PkMin: Peak Power Minimization for Multi-Threaded Many-Core Applications. Journal of Low Power Electronics and Applications 10:4, article 31. DOI: 10.3390/jlpea10040031. Online publication date: 30-Sep-2020.
      • (2020) Thermal Load-aware Adaptive Scheduling for Heterogeneous Platforms. 2020 33rd International Conference on VLSI Design and 2020 19th International Conference on Embedded Systems (VLSID), pp. 125-130. DOI: 10.1109/VLSID49098.2020.00039. Online publication date: Jan-2020.
      • (2020) DASH: Dynamic Scheduling Algorithm for Single-ISA Heterogeneous Nano-scale Many-Cores. 2020 10th International Conference on Computer and Knowledge Engineering (ICCKE), pp. 447-452. DOI: 10.1109/ICCKE50421.2020.9303673. Online publication date: 29-Oct-2020.
      • (2019) Smart Thermal Management for Heterogeneous Multicores. 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 132-137. DOI: 10.23919/DATE.2019.8715001. Online publication date: Mar-2019.
      • (2019) Prediction-Based Task Migration on S-NUCA Many-Cores. 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1579-1582. DOI: 10.23919/DATE.2019.8714974. Online publication date: Mar-2019.
      • (2019) Unified Testing and Security Framework for Wireless Network-on-Chip Enabled Multi-Core Chips. ACM Transactions on Embedded Computing Systems 18:5s, pp. 1-20. DOI: 10.1145/3358212. Online publication date: 8-Oct-2019.
