Fault-aware grid scheduling using performance prediction by workload modeling

The Journal of Supercomputing

Abstract

Computational grids hold great promise for harnessing geographically separated, heterogeneous resources to solve large-scale complex problems. However, they face a number of major technical hurdles, including distributed resource management and effective job scheduling. This work focuses on the online scheduling of real-time applications in distributed environments such as grids. Specifically, we are interested in applications composed of several independent tasks, each with a prespecified lifetime called its deadline. Our goal is to schedule such applications within an optimum overall time while respecting the specified deadlines. To achieve this, resource performance is predicted using workload modeling and queuing techniques. A mathematical neural model is then used to schedule the subtasks of the application. The main contributions of this work are the incorporation of the impatience factor and resource faults into the performance modeling of nondedicated distributed systems, and an efficient, fast parallel scheduling algorithm that operates under time constraints and heterogeneous resources. The proposed model is suitable for implementation on parallel machines, where it runs in O(1) time. The model was implemented on the GridSim toolkit and evaluated under various conditions and parameter settings to assess the performance of the scheduling algorithm. Simulation results show that in approximately 87.8% of cases, the model schedules the tasks so that all constraints are satisfied.
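
To make the problem setting concrete, the following is a minimal sketch of scheduling independent tasks with deadlines on heterogeneous, nondedicated resources. It is not the neural model or the queuing-based predictor described in the abstract: it substitutes a simple earliest-deadline-first greedy heuristic, and the names `Task`, `Resource`, `availability`, and `greedy_schedule` are hypothetical. The `availability` field stands in for whatever a performance predictor would report as the fraction of a resource's capacity left for grid jobs.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    work: float      # assumed abstract work units
    deadline: float  # seconds from the start of scheduling


@dataclass
class Resource:
    name: str
    speed: float         # work units per second when fully available
    availability: float  # assumed predicted usable fraction, in (0, 1]


def greedy_schedule(tasks, resources):
    """Assign tasks in earliest-deadline-first order.

    Each task goes to the resource on which it would finish soonest.
    Returns (assignment, all_met), where assignment maps a task name
    to (resource name, finish time) and all_met reports whether every
    deadline was satisfied.
    """
    free_at = {r.name: 0.0 for r in resources}  # when each resource is idle
    assignment = {}
    all_met = True
    for task in sorted(tasks, key=lambda t: t.deadline):
        def finish_on(r, task=task):
            # Effective speed is the nominal speed scaled by availability.
            return free_at[r.name] + task.work / (r.speed * r.availability)

        best = min(resources, key=finish_on)
        finish = finish_on(best)
        free_at[best.name] = finish
        assignment[task.name] = (best.name, finish)
        if finish > task.deadline:
            all_met = False
    return assignment, all_met


if __name__ == "__main__":
    resources = [Resource("R1", 40.0, 0.5), Resource("R2", 10.0, 1.0)]
    tasks = [Task("A", 100.0, 10.0), Task("B", 200.0, 30.0)]
    print(greedy_schedule(tasks, resources))
```

A greedy pass like this only illustrates the constraints involved (per-task deadlines, heterogeneous speeds, reduced availability on shared machines); it makes no claim to the optimality or the parallel running time reported for the proposed model.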

Author information

Corresponding author

Correspondence to Mohammad Kalantari.

About this article

Cite this article

Kalantari, M., Akbari, M.K. Fault-aware grid scheduling using performance prediction by workload modeling. J Supercomput 46, 15–39 (2008). https://doi.org/10.1007/s11227-008-0183-3

  • DOI: https://doi.org/10.1007/s11227-008-0183-3
