Abstract
The main advantage of a metacomputer is not its peak performance but better utilization of its machines. Therefore, efficient scheduling strategies are vitally important to any metacomputing project. A real metacomputer management system will not gain exclusive access to all its resources, because participating centers will not be willing to give up their autonomy. As a consequence, the scheduling algorithm has to deal with a set of local sub-schedulers, each performing individual machine management. Based on the proposal made by Feitelson and Rudolph in 1998, we developed a scheduling model that takes these circumstances into account. It has been implemented as a generic simulation environment, which we make available to the public. Using this tool, we examined the behavior of several well-known scheduling algorithms in a metacomputing scenario. The results demonstrate that interaction with the sub-schedulers, communication of parallel applications, and the huge size of the metacomputer are among the most important aspects of scheduling a metacomputer. Based upon these observations, we developed a new technique that makes it possible to use scheduling algorithms developed for less realistic machine models in real-world metacomputing projects. Simulation runs demonstrate that this technique leads to far better results than the algorithms currently used in metacomputer management systems.
References
Academic Computing Services Amsterdam. The SARA Metacomputing Project. WWW Page. http://www.sara.nl/hec/projects/meta/.
Carl Albing. Cray NQS: production batch for a distributed computing world. In Proceedings of the 11th Sun User Group Conference and Exhibition, pages 302–309, Brookline, MA, USA, December 1993. Sun User Group, Inc.
J. Almond and D. Snelling. UNICORE: Secure and Uniform Access to Distributed Resources via the World Wide Web, 1998. http://www.kfajuelich.de/zam/RD/coop/unicore/.
Stergios V. Anastasiadis and Kenneth C. Sevcik. Parallel application scheduling on networks of workstations. Journal of Parallel and Distributed Computing, 43 (2):109–124, June 1997.
T. E. Anderson, D. E. Culler, and D. A. Patterson. A case for NOW (Networks of Workstations). IEEE Micro, 15(1):54–64, February 1995.
R. Baraglia, R. Ferrini, D. Laforenza, and A. Lagana. Metacomputing to overcome the power limits of a single machine. Lecture Notes in Computer Science, 1225:982ff, 1997.
M. Calzarossa and G. Serazzi. A characterization of the variation in time of workload arrival patterns. IEEE Transactions on Computers, C-34 (2):156–162, 1985.
Olivier Catoni. Solving scheduling problems by simulated annealing. SIAM Journal on Control and Optimization, 36 (5):1539–1575, September 1998.
Steve J. Chapin, Dimitrios Katramatos, John Karpovich, and Andrew S. Grimshaw. Resource management in Legion. Technical Report CS-98-09, Department of Computer Science, University of Virginia, February 11, 1998.
Su-Hui Chiang, Rajesh K. Mansharamani, and Mary K. Vernon. Use of Application Characteristics and Limited Preemption for Run-To-Completion Parallel Processor Scheduling Policies. In Proceedings of the 1994 ACM SIGMETRICS Conference, pages 33–44, February 1994.
Cray Research. NQE. commercial product.
Thomas A. DeFanti, Ian Foster, Michael E. Papka, Rick Stevens, and Tim Kuhfuss. Overview of the I-WAY: Wide-area visual supercomputing. The International Journal of Supercomputer Applications and High Performance Computing, 10 (2/3):123–131, Summer/Fall 1996.
Jack Dongarra, Hans Meuer, and Erich Strohmaier. Top 500 Report. WWW Page, 1998. http://www.netlib.org/benchmark/top500/top500.list.html.
Allen B. Downey. A parallel workload model and its implications for processor allocation. Technical Report CSD-96-922, University of California, Berkeley, November 6, 1996.
Allen B. Downey. A model for speedup of parallel programs. Technical Report CSD-97-933, University of California, Berkeley, January 30, 1997.
D. G. Feitelson. Packing schemes for gang scheduling. Lecture Notes in Computer Science, 1162:89ff, 1996.
D. G. Feitelson and B. Nitzberg. Job characteristics of a production parallel scientific workload on the NASA Ames iPSC/860. Lecture Notes in Computer Science, 949:337ff, 1995.
D. G. Feitelson and L. Rudolph. Metrics and benchmarking for parallel job scheduling. Lecture Notes in Computer Science, 1459:1ff, 1998.
D. G. Feitelson, L. Rudolph, U. Schwiegelshohn, and K. C. Sevcik. Theory and practice in parallel job scheduling. Lecture Notes in Computer Science, 1291:1ff, 1997.
I. Foster and C. Kesselman. Globus: A metacomputing infrastructure toolkit. The International Journal of Supercomputer Applications and High Performance Computing, 11 (2):115–128, Summer 1997.
J. Gehring and F. Ramme. Architecture-independent request-scheduling with tight waiting-time estimations. Lecture Notes in Computer Science, 1162:65ff, 1996.
J. Gehring, A. Reinefeld, and A. Weber. PHASE and MICA: Application specific metacomputing. In Proceedings of Europar 97, Passau, Germany, 1997.
Genias Software GmbH, Erzgebirgstr. 2B, D-93073 Neutraubling. CODINE User’s Guide, 1993. http://www.genias.de/genias/english/codine/.
C. A. R. Hoare. Quicksort. In C. A. R. Hoare and C. B. Jones (Eds.), Essays in Computing Science, Prentice Hall, 1989.
Chao-Ju Hou and Kang G. Shin. Implementation of decentralized load sharing in networked workstations using the Condor package. Journal of Parallel and Distributed Computing, 40 (2):173–184, February 1997.
IBM Corporation. Using and Administering LoadLeveler (Release 3.0), 4 edition, August 1996. Document Number SC23-3989-00.
K. Koski. A step towards large scale parallelism: building a parallel computing environment from heterogenous resources. Future Generation Computer Systems, 11 (4-5):491–498, August 1995.
Robert R. Lipman and Judith E. Devaney. Websubmit: running supercomputer applications via the web. In Supercomputing ’96, Pittsburgh, PA, November 1996.
Walter T. Ludwig. Algorithms for scheduling malleable and nonmalleable parallel tasks. Technical Report CS-TR-95-1279, University of Wisconsin, Madison, August 1995.
The NRW Metacomputing Initiative. WWW Page. http://www.unipaderborn.de/pc2/nrwmc/.
B. J. Overeinder and P. M. A. Sloot. Breaking the curse of dynamics by task migration: Pilot experiments in the polder metacomputer.
E. W. Parsons and K. C. Sevcik. Implementing multiprocessor scheduling disciplines. Lecture Notes in Computer Science, 1291:166ff, 1997.
Platform Computing Corporation. LSF Product Information. WWW Page, October 1996. http://www.platform.com/.
F. Ramme and K. Kremer. Scheduling a metacomputer by an implicit voting system. In Int. IEEE Symposium on High-Perform94
A. Reinefeld, R. Baraglia, T. Decker, J. Gehring, D. Laforenza, F. Ramme, T. Römke, and J. Simon. The MOL project: An open, extensible metacomputer. In Debra Hensgen, editor, Proceedings of the 6th Heterogeneous Computing Workshop, pages 17–31, Washington, April 1, 1997. IEEE Computer Society Press.
V. Sander, D. Erwin, and V. Huber. High-performance computer management based on Java. Lecture Notes in Computer Science, 1401:526ff, 1998.
M. Schwehm and T. Walter. Mapping and scheduling by genetic algorithms. Lecture Notes in Computer Science, 854:832ff, 1994.
Uwe Schwiegelshohn. Preemptive weighted completion time scheduling of parallel jobs. In Josep Díaz and Maria Serna, editors, Algorithms ESA ’96, Fourth Annual European Symposium, volume 1136 of Lecture Notes in Computer Science, pages 39–51, Barcelona, Spain, 25–27 September 1996. Springer.
Uwe Schwiegelshohn and Ramin Yahyapour. Analysis of first-come-first-serve parallel job scheduling. In Proceedings of the Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 629–638, San Francisco, California, 25–27 January 1998.
i. A comparative analysis of static processor partitioning policies for parallel computers. In Internat. Workshop on Modeling and Simulation of Computer and Telecommunication Systems (MASCOTS), pages 283–286, January 1993.
Jon Siegel. CORBA: Fundamentals and Programming. John Wiley & Sons Inc., New York, 1 edition, 1996.
Larry Smarr and Charles E. Catlett. Metacomputing. Communications of the ACM, 35 (6):44–52, June 1992.
W. Smith, I. Foster, and V. Taylor. Predicting application run times using historical information. Lecture Notes in Computer Science, 1459:122ff, 1998.
A. W. van Halderen, Benno J. Overeinder, Peter M. A. Sloot, R. van Dantzig, Dick H. J. Epema, and Miron Livny. Hierarchical resource management in the polder metacomputing initiative. submitted to Parallel Computing, 1997.
© 1999 Springer-Verlag Berlin Heidelberg
Gehring, J., Preiss, T. (1999). Scheduling a Metacomputer with Uncooperative Sub-schedulers. In: Feitelson, D.G., Rudolph, L. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 1999. Lecture Notes in Computer Science, vol 1659. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47954-6_10
DOI: https://doi.org/10.1007/3-540-47954-6_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66676-9
Online ISBN: 978-3-540-47954-3
eBook Packages: Springer Book Archive