DOI:10.1145/1854273.1854306
research-article

On mitigating memory bandwidth contention through bandwidth-aware scheduling

Published: 11 September 2010

Abstract

Shared-memory multiprocessors now dominate platforms ranging from high-end systems to desktop computers. On such platforms, the interconnect between the processors and main memory is well known to have become a major bottleneck. Bandwidth-aware job scheduling is an effective and relatively easy-to-implement way to relieve bandwidth contention. Previous policies recognized that bandwidth saturation hurts the throughput of parallel jobs, so they scheduled jobs to make the total bandwidth requirement equal to the system's peak bandwidth. However, we found that fine-grained bandwidth contention still occurs within a scheduling quantum because a program's memory access intensity fluctuates irregularly, an effect that previous policies mostly ignore.
In this paper, we quantify the impact of bandwidth contention on overall performance. We found that concurrent jobs can achieve higher memory bandwidth utilization, but only at the expense of super-linear performance degradation. Based on this observation, we propose a new workload scheduling policy. Its basic idea is that interference due to bandwidth contention can be minimized when bandwidth utilization is maintained at the level of the workload's average bandwidth requirement. Our evaluation uses both SPEC 2006 and NPB workloads. On randomly generated workloads, our policy improves system throughput by 4.1% on average over the native OS scheduler, with improvements of up to 11.7% observed.
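
The basic idea can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not give one, so the function names, the per-job bandwidth estimates, and the exhaustive search over job groups are all assumptions made for illustration. The sketch reads "average bandwidth requirement of the workload" as the mean per-job demand multiplied by the number of jobs that run together, and it groups jobs so that each group's total demand stays close to that target rather than to the peak bandwidth.

```python
from itertools import combinations


def target_bandwidth(demands, cores):
    """Assumed target: mean per-job demand times the number of co-running jobs."""
    return sum(demands.values()) / len(demands) * cores


def pick_coschedule(demands, group_size, target):
    """Return the group of jobs whose summed demand is closest to the target.

    Exhaustive search is used for clarity; it is only practical for small job sets.
    """
    return min(
        combinations(sorted(demands), group_size),
        key=lambda group: abs(sum(demands[job] for job in group) - target),
    )


def schedule(demands, cores):
    """Split all jobs into groups, one group per scheduling quantum."""
    remaining = dict(demands)
    target = target_bandwidth(demands, cores)  # fixed for the whole workload
    groups = []
    while remaining:
        # The last group may be smaller if the job count is not a multiple of cores.
        group_size = min(cores, len(remaining))
        group = pick_coschedule(remaining, group_size, target)
        groups.append(group)
        for job in group:
            del remaining[job]
    return groups
```

With hypothetical demands `{"a": 4.0, "b": 1.0, "c": 3.0, "d": 2.0}` (in GB/s) and two cores, the target is 5.0 and `schedule` returns `[("a", "b"), ("c", "d")]`: each bandwidth-heavy job is paired with a lighter one. A real scheduler would have to estimate demands online, for example from hardware performance counters, and would need a cheaper heuristic than exhaustive search.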

Published In

PACT '10: Proceedings of the 19th international conference on Parallel architectures and compilation techniques
September 2010
596 pages
ISBN:9781450301787
DOI:10.1145/1854273

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. bus contention
  2. memory bandwidth
  3. process scheduling

Qualifiers

  • Research-article

Conference

PACT '10
Sponsor:
  • IFIP WG 10.3
  • IEEE CS TCPP
  • SIGARCH
  • IEEE CS TCAA

Acceptance Rates

Overall Acceptance Rate 121 of 471 submissions, 26%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 52
  • Downloads (Last 6 weeks): 6
Reflects downloads up to 25 Feb 2025

Cited By

  • (2024) Software Resource Disaggregation for HPC with Serverless Computing. 2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 139-156. DOI:10.1109/IPDPS57955.2024.00021. Online publication date: 27-May-2024
  • (2023) CABARRE: Request Response Arbitration for Shared Cache Management. ACM Transactions on Embedded Computing Systems 22:5s, pages 1-24. DOI:10.1145/3608096. Online publication date: 31-Oct-2023
  • (2023) DRL-based Task Scheduling and Shared Resource Allocation for Multi-Core Real-Time Systems. 2023 IEEE 3rd International Conference on Intelligent Technology and Embedded Systems (ICITES), pages 144-150. DOI:10.1109/ICITES59818.2023.10356887. Online publication date: 27-Oct-2023
  • (2022) DeepP: Deep Learning Multi-Program Prefetch Configuration for the IBM POWER 8. IEEE Transactions on Computers 71:10, pages 2646-2658. DOI:10.1109/TC.2021.3139997. Online publication date: 1-Oct-2022
  • (2022) Self-Optimizing Memory Controllers: Proposing Request-level Scheduling. 2022 Second International Conference on Computer Science, Engineering and Applications (ICCSEA), pages 1-5. DOI:10.1109/ICCSEA54677.2022.9936277. Online publication date: 8-Sep-2022
  • (2022) Kronos: towards bus contention-aware job scheduling in warehouse scale computers. Frontiers of Computer Science 17:1. DOI:10.1007/s11704-021-0418-5. Online publication date: 8-Aug-2022
  • (2021) Data-Intensive Computing Modules for Teaching Parallel and Distributed Computing. 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pages 350-357. DOI:10.1109/IPDPSW52791.2021.00062. Online publication date: Jun-2021
  • (2021) A Simulator for Intelligent Workload Managers in Heterogeneous Clusters. 2021 IEEE/ACM 21st International Symposium on Cluster, Cloud and Internet Computing (CCGrid), pages 196-205. DOI:10.1109/CCGrid51090.2021.00029. Online publication date: May-2021
  • (2020) Towards workload-adaptive scheduling for HPC clusters. 2020 IEEE International Conference on Cluster Computing (CLUSTER), pages 449-453. DOI:10.1109/CLUSTER49012.2020.00064. Online publication date: Sep-2020
  • (2019) Venice. ACM Transactions on Computer Systems 36:1, pages 1-26. DOI:10.1145/3310360. Online publication date: 14-Mar-2019
