Abstract
Slurm is an open source, fault-tolerant, and highly scalable workload manager used on many of the world’s supercomputers and computer clusters. As a cluster workload manager, Slurm has three key functions. First, it allocates exclusive and/or non-exclusive access to resources for some duration of time. Second, it provides a framework for starting, executing, and monitoring work on the allocated resources. Finally, it arbitrates contention for resources by managing queues of pending work and enforcing administrative policies. This paper describes the current design and capabilities of Slurm.
Notes
1. Slurm was originally an acronym for “Simple Linux Utility for Resource Management”, and stylized as “SLURM”. The acronym was dropped in 2012 and the preferred capitalization changed to “Slurm”.
2. The “partition” term was inherited from Quadrics RMS, which required a strict partitioning of compute nodes within the environment. The requirement that nodes be in disjoint partitions was discarded very early on, but the terminology has persisted.
3. This capability was successfully used at King Abdullah University of Science and Technology (KAUST) for a period when power availability was limited. We are unaware of any other organization currently using this capability.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Jette, M.A., Wickberg, T. (2023). Architecture of the Slurm Workload Manager. In: Klusáček, D., Corbalán, J., Rodrigo, G.P. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2023. Lecture Notes in Computer Science, vol 14283. Springer, Cham. https://doi.org/10.1007/978-3-031-43943-8_1
DOI: https://doi.org/10.1007/978-3-031-43943-8_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43942-1
Online ISBN: 978-3-031-43943-8
eBook Packages: Computer Science (R0)