Statistical Analysis and Modeling of Jobs in a Grid Environment

Journal of Grid Computing

Abstract

Good probabilistic models of the job arrival process and of the delay components introduced at different stages of job processing are important for a better understanding of Grid computing. In this study, we present a thorough analysis of the job arrival process in the EGEE infrastructure and of the time a job spends in each state in the EGEE environment. We define four components of the total job delay and model each component separately. We observe that the job inter-arrival times at the Grid level can be adequately modelled by a rounded exponential distribution, while the total job delay (from the time a job is generated until the time it completes execution) is dominated by the computing element's (CE's) register and queuing times and the worker node's (WN's) execution times. Further, we evaluate the efficiency of the EGEE environment by comparing its total job delay performance with that of a hypothetical ideal super-cluster, and conclude that we would obtain similar performance if we submitted the same workload to a super-cluster whose size equals 34% of the total average number of CPUs participating in the EGEE infrastructure. We also analyze the job inter-arrival times, the CE's queuing times, the WN's execution times, and the data sizes exchanged at the kallisto.hellasgrid.gr cluster, which is a node in the EGEE infrastructure. In contrast to the Grid level, we find that at the cluster level the job arrival process exhibits self-similarity/long-range dependence. Finally, we propose simple and intuitive models for the job arrival process and the execution times at the cluster level.
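As an illustration of the inter-arrival model mentioned above, the following minimal sketch generates rounded exponential samples. It assumes that "rounded exponential" means exponentially distributed times rounded to whole seconds (for example, because timestamps are logged at one-second granularity); the abstract does not spell this out. The function name, the mean of 10 seconds, and the sample count are hypothetical values chosen only for the example, not figures from the study.

```python
import random


def rounded_exponential_samples(mean_seconds, n, seed=0):
    """Draw n exponential inter-arrival times and round each to whole seconds.

    Assumption: rounding to the nearest integer second; the paper may use
    a different rounding convention.
    """
    rng = random.Random(seed)
    rate = 1.0 / mean_seconds  # expovariate takes the rate (1 / mean)
    return [round(rng.expovariate(rate)) for _ in range(n)]


# Hypothetical parameters: mean inter-arrival time of 10 s, 100,000 jobs.
samples = rounded_exponential_samples(mean_seconds=10.0, n=100_000)

# Sample mean of the rounded values; for means much larger than one second
# it stays close to the mean of the underlying exponential.
estimated_mean = sum(samples) / len(samples)
print(f"estimated mean inter-arrival time: {estimated_mean:.2f} s")
```

Rounding matters mostly when the mean inter-arrival time is comparable to the rounding step, since many samples then collapse to zero; with larger means the rounded and unrounded distributions are nearly indistinguishable.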

Author information

Corresponding author

Correspondence to Konstantinos Christodoulopoulos.

About this article

Cite this article

Christodoulopoulos, K., Gkamas, V. & Varvarigos, E.A. Statistical Analysis and Modeling of Jobs in a Grid Environment. J Grid Computing 6, 77–101 (2008). https://doi.org/10.1007/s10723-007-9089-1

  • DOI: https://doi.org/10.1007/s10723-007-9089-1