HASS: a scheduler for heterogeneous multicore systems

Published: 21 April 2009

Abstract

Future heterogeneous single-ISA multicore processors will have an edge in potential performance per watt over comparable homogeneous processors. To fully tap into that potential, the OS scheduler needs to be heterogeneity-aware, so that it can match jobs to cores according to the characteristics of both. We propose a Heterogeneity-Aware Signature-Supported (HASS) scheduling algorithm that does the matching using per-thread architectural signatures, which are compact summaries of threads' architectural properties collected offline. The resulting algorithm does not rely on dynamic profiling and is comparatively simple and scalable. We implemented HASS in OpenSolaris and achieved average workload speedups of up to 13%, matching the best static assignment, which is achievable only by an oracle. We also implemented a previously proposed dynamic IPC-driven algorithm that relies on online profiling. We found that the complexity, load imbalance, and associated performance degradation resulting from dynamic profiling are significant challenges to using this algorithm successfully. As a result, it failed to deliver the expected performance gains and did not outperform HASS.
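To make the idea of signature-based matching concrete, the following is a minimal sketch and not the algorithm from the paper. The abstract does not say what a signature contains or how the assignment is computed, so everything here is an assumption made for illustration: the `Thread` type, the `fast_core_speedup` field (a hypothetical offline-derived estimate of how much faster a thread runs on a fast core than on a slow one), the two core types, and the greedy policy in `assign_static`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thread:
    name: str
    # Hypothetical signature: predicted speedup on a fast core relative to
    # a slow core, assumed to be computed offline (1.0 means no benefit).
    fast_core_speedup: float


def assign_static(threads, n_fast, n_slow):
    """Greedy static assignment (an illustrative policy, not the paper's).

    Threads with the largest predicted benefit from a fast core are placed
    on fast cores first; the remaining threads go to slow cores. No runtime
    profiling is involved: only the offline signature is consulted.
    """
    if len(threads) > n_fast + n_slow:
        raise ValueError("more threads than cores in this simplified model")
    ranked = sorted(threads, key=lambda t: t.fast_core_speedup, reverse=True)
    return {
        t.name: ("fast" if i < n_fast else "slow")
        for i, t in enumerate(ranked)
    }


# Illustrative values only; these are not measurements from the paper.
example_threads = [
    Thread("memory_bound", 1.1),
    Thread("compute_bound", 2.0),
    Thread("mixed", 1.5),
]
example_assignment = assign_static(example_threads, n_fast=1, n_slow=2)
```

In this toy model, the single fast core goes to the thread whose signature predicts the largest gain. The decision is made once from offline data, which is the property the abstract contrasts with online, IPC-driven profiling.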

Published In

ACM SIGOPS Operating Systems Review  Volume 43, Issue 2
April 2009
119 pages
ISSN:0163-5980
DOI:10.1145/1531793

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. architectural signatures
  2. asymmetric
  3. heterogeneous
  4. multicore
  5. scheduling

Qualifiers

  • Research-article
