Abstract
Desktop Grids are popular platforms for high-throughput applications, but their inherent resource volatility makes them difficult to exploit for applications that require rapid turnaround. Efficient desktop Grid execution of short-lived applications is an attractive proposition, and we claim that it is achievable via intelligent resource selection. We propose three general techniques for resource selection: resource prioritization, resource exclusion, and task replication. We use these techniques to instantiate several scheduling heuristics, which we evaluate through trace-driven simulations of four representative desktop Grid configurations. We find that ranking desktop resources according to their clock rates, without taking into account their availability history, is surprisingly effective in practice. Our main result is that a heuristic that uses the appropriate combination of resource prioritization, resource exclusion, and task replication can achieve performance within a factor of 1.7 of optimal.
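As a rough illustration of how the three techniques might combine, the following Python sketch ranks hosts by clock rate, drops hosts below a clock-rate threshold, and assigns each task to several distinct hosts. It is not the heuristic evaluated in the article: the `Host` type, the `min_ghz` threshold, the `replicas` count, and the round-robin placement are all hypothetical choices made only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """Hypothetical host record; only the clock rate is used for ranking."""
    name: str
    clock_ghz: float


def prioritize(hosts):
    # Resource prioritization: fastest clock rate first (name breaks ties).
    return sorted(hosts, key=lambda h: (-h.clock_ghz, h.name))


def exclude(hosts, min_ghz):
    # Resource exclusion: drop hosts slower than an assumed fixed threshold.
    return [h for h in hosts if h.clock_ghz >= min_ghz]


def assign(tasks, hosts, min_ghz=1.0, replicas=2):
    """Map each task to a list of distinct host names (assumed placement rule)."""
    ranked = prioritize(exclude(hosts, min_ghz))
    if not ranked:
        raise ValueError("no host satisfies the exclusion threshold")
    # Task replication: never place two copies of a task on the same host.
    copies = min(replicas, len(ranked))
    plan = {}
    slot = 0
    for task in tasks:
        plan[task] = [ranked[(slot + i) % len(ranked)].name for i in range(copies)]
        slot += copies
    return plan


hosts = [Host("a", 2.4), Host("b", 0.8), Host("c", 3.0), Host("d", 1.6)]
plan = assign(["t1", "t2"], hosts)
print(plan)
```

With these sample hosts, host `b` falls below the assumed 1.0 GHz threshold and is excluded, and each task is placed on two distinct hosts drawn from the clock-rate ranking. A real scheduler would also have to react to hosts becoming unavailable mid-task, which this static sketch ignores.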
Cite this article
Kondo, D., Chien, A.A. & Casanova, H. Scheduling Task Parallel Applications for Rapid Turnaround on Enterprise Desktop Grids. J Grid Computing 5, 379–405 (2007). https://doi.org/10.1007/s10723-007-9063-y
DOI: https://doi.org/10.1007/s10723-007-9063-y