DOI: 10.1145/1958746.1958815
research-article

In search for contention-descriptive metrics in HPC cluster environment

Published: 14 March 2011

Abstract

In this paper, we argue that modern HPC cluster environments contain several bottlenecks, both within the multicore cluster nodes and between them in the cluster interconnects. These bottlenecks are resources that can be in high demand by several jobs executing concurrently on the cluster. The jobs therefore compete for access to these resources and can suffer performance degradation due to contention. We point out that, although contention for shared resources such as the memory hierarchy of the cluster nodes, the cluster interconnects, or a shared floating-point unit can severely degrade the performance of the cluster workload, state-of-the-art cluster schedulers lack adequate means of addressing it. To fill this gap, we propose a new set of metrics that models shared resource contention and provides fine-grained information about each job's resource utilization and communication patterns. The necessary information can be obtained from performance counters within the cluster nodes and from monitoring of the cluster interconnect between them.

Published In

ICPE '11: Proceedings of the 2nd ACM/SPEC International Conference on Performance engineering
March 2011
470 pages
ISBN:9781450305198
DOI:10.1145/1958746

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. hpc clusters
  2. multicore systems
  3. scheduling
  4. shared resource contention

Qualifiers

  • Research-article

Conference

ICPE'11

Acceptance Rates

Overall Acceptance Rate 252 of 851 submissions, 30%
