Empirical performance evaluation of schedulers for cluster of workstations

Cluster Computing

Abstract

Cluster computing is rapidly gaining popularity as a platform for high performance computing, mainly because of its favorable cost-performance ratio. Resource management systems (RMS) are the key component for managing cluster resources efficiently, and they play a vital role in the performance of distributed parallel systems, especially through their job scheduling module. In this paper, we empirically evaluate four resource management systems (SGE, TORQUE, the MAUI Scheduler, and SLURM), with a special focus on the job scheduler component. The schedulers are evaluated on a comprehensive set of metrics, including throughput and CPU, memory, and network utilization. Experiments were carried out on testbeds of three different sizes with a range of scheduler configurations, including FCFS, backfilling, fair share, and SJF scheduling techniques.

We also present a head-to-head comparison of the scheduling techniques, which highlights the effect of the RMS on their performance. The results show that the relative difference in performance among the scheduling techniques reached up to 63%. We conclude that no single RMS can be identified as the best overall, although SLURM performs better than the others in most cases.
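
To make two of the scheduling techniques named above concrete, the following is a minimal sketch that contrasts FCFS and SJF ordering on a toy workload. It is not the experimental setup of the paper: the single-node model, the `Job` type, the sample runtimes, and the `relative_difference` definition are all assumptions chosen for illustration, and the abstract does not state how its relative difference was computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    runtime: float  # seconds of compute the job needs (hypothetical values)


def mean_wait(jobs):
    """Mean queue wait when jobs run back to back, in the given order, on one node."""
    clock = 0.0
    total_wait = 0.0
    for job in jobs:
        total_wait += clock  # a job waits until every earlier job has finished
        clock += job.runtime
    return total_wait / len(jobs)


def fcfs(jobs):
    """First come, first served: keep submission order."""
    return list(jobs)


def sjf(jobs):
    """Shortest job first: run the shortest jobs before the longer ones."""
    return sorted(jobs, key=lambda job: job.runtime)


def relative_difference(a, b):
    """Assumed definition: gap between two values as a fraction of the larger one."""
    return abs(a - b) / max(a, b)


# All jobs are assumed to be submitted at time 0, in this order.
workload = [Job("a", 8.0), Job("b", 2.0), Job("c", 4.0), Job("d", 2.0)]

fcfs_wait = mean_wait(fcfs(workload))
sjf_wait = mean_wait(sjf(workload))
gap = relative_difference(fcfs_wait, sjf_wait)

print(f"FCFS mean wait: {fcfs_wait:.2f} s")
print(f"SJF mean wait:  {sjf_wait:.2f} s")
print(f"Relative difference: {gap:.1%}")
```

Even in this simplified model, the ordering policy alone changes the mean wait noticeably, which is the kind of effect a head-to-head comparison of scheduling techniques measures. Backfilling and fair share need multi-node and multi-user state, so they are left out of the sketch.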

Correspondence to Syed Munir Hussain Shah.

Cite this article

Qureshi, K., Shah, S.M.H. & Manuel, P. Empirical performance evaluation of schedulers for cluster of workstations. Cluster Comput 14, 101–113 (2011). https://doi.org/10.1007/s10586-010-0128-5

  • DOI: https://doi.org/10.1007/s10586-010-0128-5