Contention-Aware Scheduling on Multicore Systems

Authors:

Sergey Blagodurov,

Sergey Zhuravlev,

Alexandra Fedorova

ACM Transactions on Computer Systems (TOCS), Volume 28, Issue 4

Article No.: 8, Pages 1 - 45

https://doi.org/10.1145/1880018.1880019

Published: 01 December 2010

Abstract

Contention for shared resources on multicore processors remains an unsolved problem in existing systems despite significant research efforts dedicated to this problem in the past. Previous solutions focused primarily on hardware techniques and software page coloring to mitigate this problem. Our goal is to investigate how and to what extent contention for shared resources can be mitigated via thread scheduling. Scheduling is an attractive tool because it does not require extra hardware and is relatively easy to integrate into the system. Our study is the first to provide a comprehensive analysis of contention-mitigating techniques that use only scheduling. The most difficult part of the problem is to find a classification scheme for threads that determines how they affect each other when competing for shared resources. We provide a comprehensive analysis of such classification schemes using a newly proposed methodology that enables us to evaluate these schemes separately from the scheduling algorithm itself and to compare them to the optimal. As a result of this analysis we discovered a classification scheme that addresses not only contention for cache space, but also contention for other shared resources, such as the memory controller, memory bus, and prefetching hardware. To show the applicability of our analysis we design a new scheduling algorithm, which we prototype at user level, and demonstrate that it performs within 2% of the optimal. We also conclude that the highest impact of contention-aware scheduling techniques is not in improving the performance of a workload as a whole but in improving quality of service or performance isolation for individual applications and in optimizing system energy consumption.
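
As a rough illustration of mitigating contention purely through thread placement, the Python sketch below spreads memory-intensive threads across shared-cache domains and compares the result with an exhaustively computed best placement. Everything specific in it is an assumption made for illustration and is not taken from the abstract: the per-thread miss rate used as the classification metric, the summed miss rate per domain used as a stand-in for contention, the function names, and the sample numbers. It is not the algorithm the authors prototype.

```python
from itertools import permutations


def spread_by_intensity(miss_rates, n_domains, cores_per_domain):
    """Greedy placement (hypothetical): take threads from most to least
    intensive and put each into the least-loaded domain with a free core."""
    if len(miss_rates) > n_domains * cores_per_domain:
        raise ValueError("more threads than cores")
    domains = [[] for _ in range(n_domains)]
    load = [0.0] * n_domains
    for name in sorted(miss_rates, key=miss_rates.get, reverse=True):
        free = [d for d in range(n_domains)
                if len(domains[d]) < cores_per_domain]
        target = min(free, key=lambda d: load[d])
        domains[target].append(name)
        load[target] += miss_rates[name]
    return domains


def max_domain_load(domains, miss_rates):
    """Assumed contention proxy: the largest summed miss rate of any domain."""
    return max(sum(miss_rates[t] for t in d) for d in domains)


def best_possible(miss_rates, n_domains, cores_per_domain):
    """Exhaustive search over all placements; feasible only for tiny inputs."""
    names = list(miss_rates)
    best = None
    for perm in permutations(names):
        domains = [list(perm[i * cores_per_domain:(i + 1) * cores_per_domain])
                   for i in range(n_domains)]
        score = max_domain_load(domains, miss_rates)
        if best is None or score < best:
            best = score
    return best


# Hypothetical miss rates for four threads on two dual-core cache domains.
rates = {"A": 30.0, "B": 25.0, "C": 2.0, "D": 1.0}
placement = spread_by_intensity(rates, n_domains=2, cores_per_domain=2)
print(placement, max_domain_load(placement, rates), best_possible(rates, 2, 2))
```

For these sample values the greedy placement pairs the most intensive thread with the least intensive one, and its score matches the exhaustive optimum under the assumed proxy. The exhaustive search grows factorially with the number of threads, so it serves only as a reference point for small cases.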

Index Terms

Contention-Aware Scheduling on Multicore Systems
1. Software and its engineering
  1. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        Process management
        Scheduling

Published In

ACM Transactions on Computer Systems Volume 28, Issue 4

December 2010

100 pages

ISSN:0734-2071

EISSN:1557-7333

DOI:10.1145/1880018

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 December 2010

Accepted: 01 October 2010

Received: 01 May 2010

Published in TOCS Volume 28, Issue 4

Qualifiers

Research-article
Research
Refereed
