DOI: 10.1145/1183401.1183450
Article

User-guided symbiotic space-sharing of real workloads

Published: 28 June 2006

ABSTRACT

Symbiotic space-sharing is a technique that can improve system throughput by executing parallel applications in combinations and configurations that alleviate pressure on shared resources. We have shown prototype schedulers that leverage such techniques to improve throughput by 20% over conventional space-sharing schedulers when resource bottlenecks are known. Such evaluations have utilized benchmark workloads and proposed that schedulers be informed of resource bottlenecks by users at job submission time; in this work, we investigate the accuracy with which users can actually identify resource bottlenecks in real applications and the implications of these predictions for symbiotic space-sharing of production workloads. Using a large HPC platform, a representative application workload, and a sampling of expert users, we show that user inputs are of value and that for our chosen workload, user-guided symbiotic scheduling can improve throughput over conventional space-sharing by 15-22%.
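The core idea in the abstract, co-scheduling jobs whose user-declared bottlenecks differ so that they do not compete for the same shared resource, can be illustrated with a minimal sketch. This is not the authors' scheduler: the function name `pair_jobs`, the bottleneck labels, and the greedy pairing rule are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def pair_jobs(jobs):
    """Greedily pair jobs that declare different resource bottlenecks.

    jobs: list of (name, bottleneck) tuples, where `bottleneck` is a
    hypothetical user-supplied label such as "memory", "io" or "compute".
    Returns (pairs, leftovers): pairs of job names to co-locate on a node,
    and names of jobs for which no complementary partner was found.
    """
    # Group job names by declared bottleneck, preserving submission order.
    queues = defaultdict(deque)
    for name, bottleneck in jobs:
        queues[bottleneck].append(name)

    pairs = []
    while True:
        # Consider only classes that still have waiting jobs, largest first
        # (ties broken alphabetically so the result is deterministic).
        nonempty = sorted(
            (b for b in queues if queues[b]),
            key=lambda b: (-len(queues[b]), b),
        )
        if len(nonempty) < 2:
            break  # no two distinct bottleneck classes remain
        first, second = nonempty[0], nonempty[1]
        pairs.append((queues[first].popleft(), queues[second].popleft()))

    # Whatever remains shares a single bottleneck class.
    leftovers = [name for b in sorted(queues) for name in queues[b]]
    return pairs, leftovers


if __name__ == "__main__":
    submitted = [("a", "memory"), ("b", "memory"), ("c", "io"), ("d", "compute")]
    print(pair_jobs(submitted))
```

Jobs left unpaired all share one bottleneck class; in this sketch they would simply fall back to conventional space-sharing. How well such a scheme works depends on how accurate the user-supplied labels are, which is the question the paper investigates.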


                • Published in

                  ICS '06: Proceedings of the 20th annual international conference on Supercomputing
                  June 2006
                  385 pages
                  ISBN: 1595932828
                  DOI: 10.1145/1183401

                  Copyright © 2006 ACM

                  Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                  Publisher

                  Association for Computing Machinery

                  New York, NY, United States


                  Acceptance Rates

                  ICS '06 paper acceptance rate: 37 of 141 submissions, 26%
                  Overall acceptance rate: 584 of 2,055 submissions, 28%
