Abstract
Workflows are prevalent in scientific computation. Emerging multicluster environments provide more resources, which benefits workflows but also challenges traditional workflow scheduling heuristics. In a multicluster environment, each cluster has its own independent workload management system. Jobs are queued before they execute, so they experience different resource availability and wait times depending on the cluster to which they are dispatched. However, existing scheduling heuristics neither consider queue wait time nor balance performance gain against data movement cost. The proposed algorithm leverages advances in queue wait time prediction techniques, and the work empirically studies whether the tunability of resource requirements helps scheduling. Extensive experiments with both real workload traces and a test bench show that the queue wait time aware algorithm improves workflow performance by 3 to 10 times in terms of average makespan, at a relatively low data movement cost.
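To make the trade-off in the abstract concrete, the following is a minimal sketch of queue wait time aware cluster selection. It is not the algorithm proposed in the article. It only illustrates the idea of weighing a predicted queue wait against data movement cost when dispatching a single job. The `Cluster` fields, the `estimated_finish` cost model, and all numbers are assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    # All fields are hypothetical inputs chosen for this sketch.
    name: str
    predicted_wait_s: float  # assumed output of some queue wait time predictor
    speedup: float           # relative execution speed (1.0 = baseline)
    bandwidth_mb_s: float    # bandwidth available for staging input data


def estimated_finish(cluster, base_runtime_s, input_mb, data_is_local):
    """Estimate completion time as wait + data transfer + execution."""
    transfer_s = 0.0 if data_is_local else input_mb / cluster.bandwidth_mb_s
    return cluster.predicted_wait_s + transfer_s + base_runtime_s / cluster.speedup


def pick_cluster(clusters, base_runtime_s, input_mb, data_location):
    """Return the cluster with the smallest estimated completion time."""
    return min(
        clusters,
        key=lambda c: estimated_finish(
            c, base_runtime_s, input_mb, c.name == data_location
        ),
    )


clusters = [
    Cluster("A", predicted_wait_s=3600.0, speedup=1.0, bandwidth_mb_s=100.0),
    Cluster("B", predicted_wait_s=60.0, speedup=1.0, bandwidth_mb_s=10.0),
]

# Small input stored on A: moving the data to B is cheaper than waiting on A.
small = pick_cluster(clusters, base_runtime_s=600.0, input_mb=1000.0, data_location="A")

# Large input stored on A: the transfer cost outweighs the shorter queue on B.
large = pick_cluster(clusters, base_runtime_s=600.0, input_mb=100000.0, data_location="A")
```

In this toy setting the short queue on cluster B wins for the small input, while the large input stays on cluster A where its data already resides, which is the balance between performance gain and data movement cost that the abstract describes.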
Additional information
This work is supported in part by the US National Science Foundation CAREER Grant No. CCF-0643521.
About this article
Cite this article
Yu, ZF., Shi, WS. Queue Waiting Time Aware Dynamic Workflow Scheduling in Multicluster Environments. J. Comput. Sci. Technol. 25, 864–873 (2010). https://doi.org/10.1007/s11390-010-9371-8
DOI: https://doi.org/10.1007/s11390-010-9371-8