Article

User-guided symbiotic space-sharing of real workloads

Authors:

Jonathan Weinberg,

Allan Snavely

ICS '06: Proceedings of the 20th annual international conference on Supercomputing

Pages 345 - 352

https://doi.org/10.1145/1183401.1183450

Published: 28 June 2006

Abstract

Symbiotic space-sharing is a technique that can improve system throughput by executing parallel applications in combinations and configurations that alleviate pressure on shared resources. We have shown prototype schedulers that leverage such techniques to improve throughput by 20% over conventional space-sharing schedulers when resource bottlenecks are known. Such evaluations have utilized benchmark workloads and proposed that schedulers be informed of resource bottlenecks by users at job submission time; in this work, we investigate the accuracy with which users can actually identify resource bottlenecks in real applications and the implications of these predictions for symbiotic space-sharing of production workloads. Using a large HPC platform, a representative application workload, and a sampling of expert users, we show that user inputs are of value and that for our chosen workload, user-guided symbiotic scheduling can improve throughput over conventional space-sharing by 15-22%.
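
The idea of letting users declare each job's resource bottleneck at submission time can be illustrated with a small sketch. The code below is not the authors' scheduler; it is a hypothetical greedy pairing, and the function name `pair_jobs`, the bottleneck labels ("memory", "io", "cpu"), and the two-jobs-per-node assumption are all illustrative choices rather than details from the paper. It only shows the general principle that jobs stressing different shared resources are preferred as co-runners.

```python
from collections import defaultdict, deque


def pair_jobs(jobs):
    """Greedily pair jobs whose user-declared bottlenecks differ.

    jobs: list of (job_name, bottleneck) tuples; the bottleneck labels are
          hypothetical user hints, e.g. "memory", "io", "cpu".
    Returns a list of tuples of job names. Each tuple is the set of jobs that
    would share a node; a leftover job with no partner is a 1-tuple.
    """
    # Group waiting jobs by their declared bottleneck, preserving order.
    queues = defaultdict(deque)
    for name, bottleneck in jobs:
        queues[bottleneck].append(name)

    pairs = []
    while True:
        # Take one job from each of the two fullest bottleneck classes,
        # so co-runners never share a declared bottleneck.
        nonempty = sorted((q for q in queues.values() if q),
                          key=len, reverse=True)
        if len(nonempty) < 2:
            break
        pairs.append((nonempty[0].popleft(), nonempty[1].popleft()))

    # Whatever remains shares a single bottleneck; fall back to pairing
    # those jobs with each other (assumption: running beats waiting).
    leftovers = [name for q in queues.values() for name in q]
    for i in range(0, len(leftovers), 2):
        pairs.append(tuple(leftovers[i:i + 2]))
    return pairs


if __name__ == "__main__":
    demo = [("a", "memory"), ("b", "memory"), ("c", "io"), ("d", "cpu")]
    print(pair_jobs(demo))  # [('a', 'c'), ('b', 'd')]
```

A real scheduler would also have to account for job sizes, runtimes, and inaccurate user hints, which is the question the paper investigates.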

Published In

ICS '06: Proceedings of the 20th annual international conference on Supercomputing

June 2006

385 pages

ISBN:1595932828

DOI:10.1145/1183401

General Chairs: Greg Egan (Monash University), Yoichi Muraoka (Waseda University)

Copyright © 2006 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ICS06: International Conference on Supercomputing 2006

June 28 - July 1, 2006

Cairns, Queensland, Australia

Acceptance Rates

ICS '06 Paper Acceptance Rate 37 of 141 submissions, 26%;

Overall Acceptance Rate 629 of 2,180 submissions, 29%
