Parallel Job Scheduling under Dynamic Workloads

  • Conference paper
Job Scheduling Strategies for Parallel Processing (JSSPP 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2862)

Abstract

Jobs that run on parallel systems that use gang scheduling for multiprogramming may interact with each other in various ways. These interactions are affected by system parameters such as the level of multiprogramming and the scheduling time quantum. A careful evaluation is therefore required in order to find parameter values that lead to optimal performance. We perform a detailed performance evaluation of three factors affecting scheduling systems running dynamic workloads (multiprogramming level, time quantum, and the use of backfilling for queue management) and of how they depend on offered load. Our evaluation is based on synthetic MPI applications running on a real cluster that actually implements the various scheduling schemes. Our results demonstrate the importance of both components of the gang-scheduling plus backfilling combination: gang scheduling reduces response time and slowdown, and backfilling allows doing so with a limited multiprogramming level. This is further improved by using flexible coscheduling rather than strict gang scheduling, as this reduces the constraints and allows for a denser packing.
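
The two metrics named in the abstract, response time and slowdown, can be illustrated with a minimal sketch. It assumes the common definitions (response time is completion minus arrival; slowdown is response time divided by the job's runtime on a dedicated machine). The `Job` record, its field names, and the sample values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Job:
    """Hypothetical job record; all times are in seconds."""
    arrival: float            # time the job was submitted
    completion: float         # time the job finished
    dedicated_runtime: float  # runtime if the job had the machine to itself


def response_time(job: Job) -> float:
    # Total time in the system: queueing plus (possibly time-sliced) execution.
    return job.completion - job.arrival


def slowdown(job: Job) -> float:
    # Response time normalized by the dedicated runtime; 1.0 means no delay.
    if job.dedicated_runtime <= 0:
        raise ValueError("dedicated_runtime must be positive")
    return response_time(job) / job.dedicated_runtime


def mean_response_time(jobs: list[Job]) -> float:
    return mean(response_time(j) for j in jobs)


def mean_slowdown(jobs: list[Job]) -> float:
    return mean(slowdown(j) for j in jobs)


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    jobs = [Job(0.0, 30.0, 10.0), Job(5.0, 25.0, 20.0)]
    print(mean_response_time(jobs), mean_slowdown(jobs))
```

Slowdown weights short jobs more heavily than response time does, since a fixed delay is a larger multiple of a short runtime.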

This work was supported by the U.S. Department of Energy through Los Alamos National Laboratory contract W-7405-ENG-36, and by the Israel Science Foundation (grant no. 219/99).

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Frachtenberg, E., Feitelson, D.G., Fernandez, J., Petrini, F. (2003). Parallel Job Scheduling under Dynamic Workloads. In: Feitelson, D., Rudolph, L., Schwiegelshohn, U. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2003. Lecture Notes in Computer Science, vol 2862. Springer, Berlin, Heidelberg. https://doi.org/10.1007/10968987_11

  • DOI: https://doi.org/10.1007/10968987_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20405-3

  • Online ISBN: 978-3-540-39727-4

  • eBook Packages: Springer Book Archive
