Modeling User Runtime Estimates

  • Conference paper
Job Scheduling Strategies for Parallel Processing (JSSPP 2005)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3834)

Abstract

User estimates of job runtimes have emerged as an important component of the workload on parallel machines. They can have a significant impact on how a scheduler treats different jobs, and thus on overall performance. It is therefore highly desirable to have a good model of the relationship between parallel jobs and their associated estimates. We construct such a model based on a detailed analysis of several workload traces. The model incorporates those features that are consistent in all of the logs, most notably the inherently modal nature of estimates (e.g., only 20 different values are used as estimates for about 90% of the jobs). We find that the behavior of users, as manifested through the estimate distributions, is remarkably similar across the different workload traces. Indeed, providing our model with only the maximal allowed estimate value, along with the percentage of jobs that have used it, yields results that are very similar to the original. The remaining difference (if any) is largely eliminated by providing information on one or two additional popular estimates. Consequently, in comparison to previous models, simulations that utilize our model are better at reproducing scheduling behavior similar to that observed when using real estimates.
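The modality described in the abstract can be checked on any list of estimates with a few lines of Python. The following is only a minimal sketch and not the authors' model: the function names `top_k_coverage` and `max_estimate_share` and the toy data are hypothetical, chosen for illustration. It measures what fraction of jobs is covered by the k most popular estimate values, and extracts the two inputs the abstract names (the maximal estimate value and the share of jobs that used it).

```python
from collections import Counter


def top_k_coverage(estimates, k):
    """Fraction of jobs whose estimate is one of the k most popular values."""
    if not estimates:
        raise ValueError("estimates must be non-empty")
    counts = Counter(estimates)
    covered = sum(count for _, count in counts.most_common(k))
    return covered / len(estimates)


def max_estimate_share(estimates):
    """Return (maximal estimate value, fraction of jobs that used it)."""
    if not estimates:
        raise ValueError("estimates must be non-empty")
    top = max(estimates)
    return top, estimates.count(top) / len(estimates)


# Hypothetical toy data (estimates in seconds); not taken from any real trace.
toy_estimates = [3600] * 5 + [7200] * 3 + [64800] * 2 + [900, 1234]

coverage = top_k_coverage(toy_estimates, 2)
max_value, max_share = max_estimate_share(toy_estimates)
print(f"top-2 coverage: {coverage:.2f}")
print(f"max estimate {max_value}s used by {max_share:.2f} of jobs")
```

On a real trace, a coverage value near 0.9 for k = 20 would correspond to the modality the abstract reports.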

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tsafrir, D., Etsion, Y., Feitelson, D.G. (2005). Modeling User Runtime Estimates. In: Feitelson, D., Frachtenberg, E., Rudolph, L., Schwiegelshohn, U. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2005. Lecture Notes in Computer Science, vol 3834. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11605300_1

  • DOI: https://doi.org/10.1007/11605300_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-31024-2

  • Online ISBN: 978-3-540-31617-6

  • eBook Packages: Computer Science, Computer Science (R0)
