
Robust task scheduling for volunteer computing systems

The Journal of Supercomputing

Abstract

Performance perturbations are a natural phenomenon in volunteer computing systems, and scheduling parallel applications with precedence constraints is emerging as a new challenge in these systems. In this paper, we propose two novel robust task scheduling heuristics that identify the best task-resource matches in terms of makespan and robustness. Both heuristics are based on a proactive reallocation (or schedule expansion) scheme that enables output schedules to tolerate a certain degree of performance degradation. Schedules are initially generated with a focus on makespan. These schedules are then scrutinized for possible rescheduling onto additional volunteer computing resources to increase their robustness. Specifically, robustness is improved by maximizing either the total allowable delay time or the minimum relative allowable delay time over all allocated volunteer resources. Allowable delay times may occur due to precedence constraints. The two proposed heuristics are evaluated with an extensive set of simulations. The simulation results indicate that our approach contributes significantly to improving the robustness of the resulting schedules.
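To make the two robustness objectives named in the abstract more concrete, the following is a minimal sketch rather than the authors' method. It assumes that a task's allowable delay is the gap between its finish time and the earliest start of any task that must wait for it (a successor in the task graph or the next task on the same resource), and that a resource's relative allowable delay is its summed allowable delay divided by its busy time. Communication costs are ignored. The function names, the schedule layout, and the four-task example are all hypothetical.

```python
from collections import defaultdict


def allowable_delays(schedule, successors):
    """Return task -> allowable delay under the assumptions stated above.

    schedule:   task -> (resource, start, finish)
    successors: task -> list of tasks that depend on it
    """
    makespan = max(finish for _, _, finish in schedule.values())

    # The next task on the same resource also limits how far a task may slip.
    by_resource = defaultdict(list)
    for task, (res, start, _) in schedule.items():
        by_resource[res].append((start, task))
    next_on_resource = {}
    for tasks in by_resource.values():
        tasks.sort()
        for (_, earlier), (_, later) in zip(tasks, tasks[1:]):
            next_on_resource[earlier] = later

    delays = {}
    for task, (_, _, finish) in schedule.items():
        followers = list(successors.get(task, []))
        if task in next_on_resource:
            followers.append(next_on_resource[task])
        # Tasks with no followers may slip only up to the makespan.
        limit = min((schedule[f][1] for f in followers), default=makespan)
        delays[task] = limit - finish
    return delays


def robustness_metrics(schedule, successors):
    """Return (total allowable delay, minimum relative allowable delay)."""
    delays = allowable_delays(schedule, successors)
    slack = defaultdict(float)
    busy = defaultdict(float)
    for task, (res, start, finish) in schedule.items():
        slack[res] += delays[task]
        busy[res] += finish - start  # assumes every task has positive duration
    total = sum(slack.values())
    min_relative = min(slack[r] / busy[r] for r in busy)
    return total, min_relative


# Hypothetical diamond-shaped task graph scheduled on two resources.
successors = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
schedule = {
    "A": ("r1", 0, 2),
    "B": ("r1", 2, 5),
    "C": ("r2", 2, 4),
    "D": ("r1", 6, 8),
}
total, min_relative = robustness_metrics(schedule, successors)
print(total, min_relative)
```

Under these assumptions, a schedule-expansion step could compare candidate reallocations by either returned value, preferring the candidate with the larger total or the larger minimum relative allowable delay.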



Author information


Corresponding author

Correspondence to Albert Y. Zomaya.


About this article

Cite this article

Lee, Y.C., Zomaya, A.Y. & Siegel, H.J. Robust task scheduling for volunteer computing systems. J Supercomput 53, 163–181 (2010). https://doi.org/10.1007/s11227-009-0326-1


  • DOI: https://doi.org/10.1007/s11227-009-0326-1
