Abstract
Scheduling large-scale applications in heterogeneous grid systems is a fundamental NP-complete problem, and solving it well is critical to obtaining good performance and execution cost. Achieving high performance in a grid system requires effective task partitioning, resource management, and load balancing. The heterogeneous and dynamic nature of a grid, together with the diverse demands of the applications running on it, makes grid scheduling a major challenge. Existing schedulers for wide-area heterogeneous systems require a large amount of information about the application and the grid environment to produce reasonable schedules. However, this information may not be available, may be too expensive to collect, or may increase the scheduler's runtime overhead to the point that the scheduler is rendered ineffective. We believe that no single scheduler is appropriate for all grid systems and applications. Data parallel applications, in which the data can be partitioned further, can be improved through efficient resource management, smart resource selection, and load balancing, whereas in functional (non-divisible task) parallel applications such partitioning is impossible, difficult, or expensive in terms of performance. In this paper, we propose a scheduler for data parallel applications (SDPA) that offers an efficient task partitioning and load balancing strategy for data parallel applications in a grid environment. SDPA offers two major features: it maintains job priority even when an insufficient number of free resources is available, and it pre-assigns tasks to cut the idle time of nodes. SDPA selects nodes according to the nature of the task and the availability of the nodes' resources. Simulation results show that SDPA improves on the strategies reported in the reviewed literature in terms of execution time, throughput, and waiting time.
Cite this article
Khan, K.H., Qureshi, K. & Abd-El-Barr, M. An efficient grid scheduling strategy for data parallel applications. J Supercomput 68, 1487–1502 (2014). https://doi.org/10.1007/s11227-014-1114-0
DOI: https://doi.org/10.1007/s11227-014-1114-0