Scheduling Task Parallel Applications for Rapid Turnaround on Enterprise Desktop Grids

Kondo, Derrick; Chien, Andrew A.; Casanova, Henri

doi:10.1007/s10723-007-9063-y

Published: 21 March 2007

Journal of Grid Computing
Volume 5, pages 379–405 (2007)

Derrick Kondo¹,
Andrew A. Chien² &
Henri Casanova³

Abstract

Desktop Grids are popular platforms for high throughput applications, but due to their inherent resource volatility it is difficult to exploit them for applications that require rapid turnaround. Efficient desktop Grid execution of short-lived applications is an attractive proposition and we claim that it is achievable via intelligent resource selection. We propose three general techniques for resource selection: resource prioritization, resource exclusion, and task duplication. We use these techniques to instantiate several scheduling heuristics. We evaluate these heuristics through trace-driven simulations of four representative desktop Grid configurations. We find that ranking desktop resources according to their clock rates, without taking into account their availability history, is surprisingly effective in practice. Our main result is that a heuristic that uses the appropriate combination of resource prioritization, resource exclusion, and task replication can achieve performance within a factor of 1.7 of optimal in practice.
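The three techniques named in the abstract can be illustrated with a small sketch. This is not the authors' heuristic, whose details are not given on this page. It only assumes that hosts are ranked by clock rate (prioritization), that hosts below a fraction of the fastest clock rate are dropped (exclusion), and that leftover hosts run extra copies of the tasks placed on the slowest selected hosts (replication). The `Host` type, the `min_fraction` threshold, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """A desktop resource described only by its clock rate (hypothetical type)."""
    name: str
    clock_ghz: float


def prioritize(hosts):
    """Resource prioritization: rank hosts by clock rate, fastest first."""
    return sorted(hosts, key=lambda h: h.clock_ghz, reverse=True)


def exclude(hosts, min_fraction=0.5):
    """Resource exclusion: drop hosts slower than min_fraction of the fastest.

    The fixed-fraction threshold is an assumption made for illustration.
    """
    if not hosts:
        return []
    cutoff = min_fraction * max(h.clock_ghz for h in hosts)
    return [h for h in hosts if h.clock_ghz >= cutoff]


def assign(tasks, hosts, min_fraction=0.5):
    """Return a plan mapping each task to the names of the hosts that run it."""
    ranked = prioritize(exclude(hosts, min_fraction))
    plan = {task: [] for task in tasks}
    primaries, spares = ranked[:len(tasks)], ranked[len(tasks):]
    # One primary copy per task, on the fastest remaining hosts.
    for task, host in zip(tasks, primaries):
        plan[task].append(host.name)
    # Task replication: spare hosts duplicate the tasks whose primary
    # host is slowest, since those are the most likely to finish last.
    placed = tasks[:len(primaries)]
    for host, task in zip(spares, reversed(placed)):
        plan[task].append(host.name)
    return plan


hosts = [Host("a", 3.0), Host("b", 1.0), Host("c", 2.4), Host("d", 2.0)]
plan = assign(["t1", "t2"], hosts)
```

In this example host `b` is excluded, `t1` and `t2` go to `a` and `c`, and the spare host `d` runs a replica of `t2`. If there are fewer usable hosts than tasks, some tasks keep an empty host list; a real scheduler would queue them until a host frees up.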



Author information

Authors and Affiliations

¹ Derrick Kondo: Laboratoire de Recherche en Informatique/INRIA Futurs, Bâtiment 490, Université Paris Sud, Orsay, 92405, France
² Andrew A. Chien: Department of Computer Science and Engineering, University of California, San Diego, CA, USA
³ Henri Casanova: Department of Information and Computer Sciences, University of Hawai‘i, Manoa, Hawai‘i

Corresponding author

Correspondence to Derrick Kondo.

About this article

Cite this article

Kondo, D., Chien, A.A. & Casanova, H. Scheduling Task Parallel Applications for Rapid Turnaround on Enterprise Desktop Grids. J Grid Computing 5, 379–405 (2007). https://doi.org/10.1007/s10723-007-9063-y

Received: 28 April 2006
Accepted: 11 January 2007
Published: 21 March 2007
Issue Date: December 2007
DOI: https://doi.org/10.1007/s10723-007-9063-y
