Job Scheduling

  • Reference work entry
Encyclopedia of Parallel Computing

Synonyms

Node allocation; Processor allocation

Definition

A parallel job scheduler allocates nodes for parallel jobs and coordinates the order in which jobs are run. With enough resources available, a system can execute multiple parallel jobs simultaneously, while other jobs are enqueued and wait for nodes to become available. The job scheduler manages the queues of waiting jobs and oversees node allocation. The goals of a scheduler are to optimize throughput of a system (number of jobs completed per time unit), provide response time guarantees (finish a job by a deadline), and keep utilization of compute resources high.
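As a rough illustration of the definition above, the following minimal sketch simulates a scheduler that keeps a queue of waiting jobs, allocates nodes as they become free, and reports throughput and utilization. The first-come, first-served policy, the `Job` fields, and the function name `fcfs_schedule` are assumptions made for this example; the entry does not prescribe a particular policy, and the deadline goal is not modeled here.

```python
import heapq
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int    # nodes requested at submission
    runtime: int  # run time in abstract time units (hypothetical input)


def fcfs_schedule(jobs, total_nodes):
    """Simulate first-come, first-served node allocation.

    Returns (finish_times, throughput, utilization).
    """
    if not jobs:
        return {}, 0.0, 0.0
    for job in jobs:
        if job.nodes > total_nodes:
            raise ValueError(f"{job.name} requests more nodes than exist")
        if job.runtime <= 0:
            raise ValueError(f"{job.name} needs a positive runtime")

    queue = deque(jobs)  # waiting jobs, in submission order
    running = []         # min-heap of (finish_time, name, nodes)
    free = total_nodes
    now = 0
    finish_times = {}

    while queue or running:
        # Start queued jobs strictly in order while the head job fits.
        while queue and queue[0].nodes <= free:
            job = queue.popleft()
            free -= job.nodes
            heapq.heappush(running, (now + job.runtime, job.name, job.nodes))
        # Advance time to the next completion and release its nodes.
        now, name, nodes = heapq.heappop(running)
        free += nodes
        finish_times[name] = now

    makespan = now
    busy_node_time = sum(j.nodes * j.runtime for j in jobs)
    throughput = len(jobs) / makespan                     # jobs per time unit
    utilization = busy_node_time / (total_nodes * makespan)
    return finish_times, throughput, utilization


jobs = [Job("A", 2, 3), Job("B", 2, 2), Job("C", 4, 1), Job("D", 1, 1)]
finish_times, throughput, utilization = fcfs_schedule(jobs, total_nodes=4)
print(finish_times, throughput, utilization)
```

In this run, job C needs all four nodes and therefore waits until both A and B have finished, which leaves two nodes idle between time 2 and time 3. Job D in turn waits behind C even though a node is free earlier. Idle periods of this kind are what lower utilization under a strict in-order policy.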

Discussion

Introduction

Users of a parallel system submit their jobs by specifying which application they would like to run and how many nodes they need. It is then the task of the job scheduler to find and allocate the appropriate number of nodes. This differs from a sequential system, where the operating system (OS) is responsible for scheduling processes and...

Copyright information

© 2011 Springer Science+Business Media, LLC

About this entry

Cite this entry

Riesen, R., Maccabe, A.B. (2011). Job Scheduling. In: Padua, D. (eds) Encyclopedia of Parallel Computing. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-09766-4_212
