Prediction of Resource Availability in Fine-Grained Cycle Sharing Systems Empirical Evaluation

Journal of Grid Computing

Abstract

Fine-Grained Cycle Sharing (FGCS) systems aim to utilize the large amount of computational resources available on the Internet. In FGCS, host computers allow guest jobs to use their CPU cycles as long as the jobs do not significantly impact local users. Such resources are generally provided voluntarily, and their availability fluctuates widely. Guest jobs may fail unexpectedly when resources become unavailable. To improve this situation, we consider methods to predict resource availability. This paper presents empirical studies on resource availability in FGCS systems and a prediction method. From studies on resource contention among guest jobs and local users, we derive a multi-state availability model. The model enables us to detect resource unavailability in a non-intrusive way. We analyzed traces collected over 3 months from a production FGCS system. The results suggest the feasibility of predicting resource availability, and motivate our method of applying semi-Markov process models for the prediction. We describe the prediction framework and its implementation in a production FGCS system, named iShare. Through experiments on an iShare testbed, we demonstrate that the prediction achieves an accuracy of 86% on average and outperforms linear time series models, while the computational cost is negligible. Our experimental results also show that the prediction is robust in the presence of irregular resource availability. We tested the effectiveness of the prediction in a proactive scheduler. Initial results show that applying availability prediction to job scheduling reduces the number of jobs that fail because of resource unavailability.
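
To make the idea of semi-Markov availability prediction concrete, the following is a minimal, generic sketch and not the authors' implementation. The abstract does not define the states of the multi-state model or the estimation procedure, so the state names (`idle`, `busy`, `down`), the trace format, and both function names are hypothetical. The sketch assumes discrete time steps and holding times of at least one step. It estimates a semi-Markov kernel from an observed trace and then computes the probability that a failure state is entered within a given horizon.

```python
from collections import defaultdict


def fit_semi_markov(trace):
    """Estimate kernel[i][(j, k)]: the probability that a visit to state i
    lasts k steps and is followed by state j.

    `trace` is a list of (state, holding_steps) visits in time order.
    The final visit has no observed successor and is not counted.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for (state, hold), (next_state, _) in zip(trace, trace[1:]):
        counts[state][(next_state, hold)] += 1
    kernel = {}
    for state, outcomes in counts.items():
        total = sum(outcomes.values())
        kernel[state] = {key: n / total for key, n in outcomes.items()}
    return kernel


def failure_probability(kernel, start, failure_states, horizon):
    """Probability that a failure state is entered within `horizon` steps,
    assuming the resource has just entered `start` at time 0.

    Assumes every holding time in the kernel is at least one step.
    """
    memo = {}

    def fail(state, remaining):
        if state in failure_states:
            return 1.0
        if remaining <= 0 or state not in kernel:
            return 0.0
        key = (state, remaining)
        if key not in memo:
            # Condition on the first transition out of `state`: it must
            # happen within the remaining time to contribute at all.
            memo[key] = sum(
                p * fail(next_state, remaining - hold)
                for (next_state, hold), p in kernel[state].items()
                if hold <= remaining
            )
        return memo[key]

    return fail(start, horizon)


# Hypothetical trace: each entry is (state, number of steps spent there).
example_trace = [
    ("idle", 2), ("busy", 1), ("idle", 2),
    ("down", 3), ("idle", 2), ("busy", 1),
]
example_kernel = fit_semi_markov(example_trace)
p_fail = failure_probability(example_kernel, "idle", {"down"}, horizon=5)
print(f"Estimated availability over 5 steps: {1 - p_fail:.3f}")
```

A scheduler could compare `1 - p_fail` for several hosts over a job's expected run time and prefer the host with the highest value; this selection rule is an illustrative assumption rather than the policy evaluated in the paper.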

Author information

Corresponding author

Correspondence to Xiaojuan Ren.

Additional information

This work was supported, in part, by the National Science Foundation under Grants No. 0103582-EIA, 0429535-CCF, and 0650016-CNS. We thank Ruben Torres for his help with the reference prediction algorithms used in our experiments.

About this article

Cite this article

Ren, X., Lee, S., Eigenmann, R. et al. Prediction of Resource Availability in Fine-Grained Cycle Sharing Systems Empirical Evaluation. J Grid Computing 5, 173–195 (2007). https://doi.org/10.1007/s10723-007-9077-5

  • DOI: https://doi.org/10.1007/s10723-007-9077-5
