Scheduling a Metacomputer with Uncooperative Sub-schedulers

  • Conference paper
Job Scheduling Strategies for Parallel Processing (JSSPP 1999)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1659)

Abstract

The main advantage of a metacomputer is not its peak performance but better utilization of its machines. Therefore, efficient scheduling strategies are vitally important to any metacomputing project. A real metacomputer management system will not gain exclusive access to all its resources, because participating centers will not be willing to give up autonomy. As a consequence, the scheduling algorithm has to deal with a set of local sub-schedulers, each performing its own machine management. Based on the proposal made by Feitelson and Rudolph in 1998, we developed a scheduling model that takes these circumstances into account. It has been implemented as a generic simulation environment, which we make available to the public. Using this tool, we examined the behavior of several well-known scheduling algorithms in a metacomputing scenario. The results demonstrate that interaction with the sub-schedulers, communication of parallel applications, and the huge size of the metacomputer are among the most important aspects of scheduling a metacomputer. Based upon these observations, we developed a new technique that makes it possible to use scheduling algorithms developed for less realistic machine models in real-world metacomputing projects. Simulation runs demonstrate that this technique leads to far better results than the algorithms currently used in metacomputer management systems.
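To make the setting concrete, the following is a minimal sketch of a metascheduler that cannot reorder the queues of its local sub-schedulers and can only ask each one for an estimated start time. It is not the model or the simulation environment described in the paper: the class `SubScheduler`, the function `place`, the one-job-at-a-time FCFS machine, and the greedy earliest-start rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class SubScheduler:
    """Hypothetical local scheduler for one machine (illustrative only).

    It runs jobs strictly first-come-first-served, one at a time, and the
    metascheduler has no way to reorder or preempt its queue.
    """
    name: str
    nodes: int             # size of the machine
    free_at: float = 0.0   # time at which already queued work is finished

    def estimate_start(self, now: float) -> float:
        # The only information the metascheduler is allowed to query.
        return max(now, self.free_at)

    def submit(self, now: float, runtime: float) -> float:
        # Append the job to the local queue and return its finish time.
        start = self.estimate_start(now)
        self.free_at = start + runtime
        return self.free_at


def place(job_nodes: int, runtime: float, now: float, machines: list) -> tuple:
    """Greedy placement: pick the large-enough machine with the earliest
    estimated start time, then hand the job to its sub-scheduler."""
    candidates = [m for m in machines if m.nodes >= job_nodes]
    if not candidates:
        raise ValueError("no machine is large enough for this job")
    best = min(candidates, key=lambda m: m.estimate_start(now))
    return best.name, best.submit(now, runtime)


if __name__ == "__main__":
    # Machine A is busy with local jobs until t=10; machine B is idle but small.
    machines = [SubScheduler("A", nodes=64, free_at=10.0),
                SubScheduler("B", nodes=16)]
    print(place(32, 5.0, 0.0, machines))  # only A is large enough
    print(place(8, 5.0, 0.0, machines))   # B can start immediately
```

Even in this toy version, the metascheduler's decisions depend entirely on what the sub-schedulers report, which is the kind of interaction the abstract identifies as important.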

References

  1. Academic Computing Services Amsterdam. The SARA Metacomputing Project. WWW Page. http://www.sara.nl/hec/projects/meta/.

  2. Carl Albing. Cray NQS: production batch for a distributed computing world. In Proceedings of the 11th Sun User Group Conference and Exhibition, pages 302–309, Brookline, MA, USA, December 1993. Sun User Group, Inc.

  3. J. Almond and D. Snelling. UNICORE: Secure and Uniform Access to Distributed Resources via the World Wide Web, 1998. http://www.kfajuelich.de/zam/RD/coop/unicore/.

  4. Stergios V. Anastasiadis and Kenneth C. Sevcik. Parallel application scheduling on networks of workstations. Journal of Parallel and Distributed Computing, 43 (2):109–124, June 1997.

  5. T. E. Anderson, D. E. Culler, and D. A. Patterson. A case for NOW (Networks of Workstations). IEEE Micro, 15(1):54–64, February 1995.

  6. R. Baraglia, R. Ferrini, D. Laforenza, and A. Lagana. Metacomputing to overcome the power limits of a single machine. Lecture Notes in Computer Science, 1225:982ff, 1997.

  7. M. Calzarossa and G. Serazzi. A characterization of the variation in time of workload arrival patterns. IEEE Transactions on Computers, Vol.C-34:2, 156–162, 1985.

  8. Olivier Catoni. Solving scheduling problems by simulated annealing. SIAM Journal on Control and Optimization, 36 (5):1539–1575, September 1998.

  9. Steve J. Chapin, Dimitrios Katramatos, John Karpovich, and Andrew S. Grimshaw. Resource management in Legion. Technical Report CS-98-09, Department of Computer Science, University of Virginia, February 11 1998. Wed, 19 Aug 1998 17:14:25 GMT.

  10. Su-Hui Chiang, Rajesh K. Mansharamani, and Mary K. Vernon. Use of Application Characteristics and Limited Preemption for Run-To-Completion Parallel Processor Scheduling Policies. In Proceedings of the 1994 ACM SIGMETRICS Conference, pages 33–44, February 1994.

  11. Cray Research. NQE. commercial product.

  12. Thomas A. DeFanti, Ian Foster, Michael E. Papka, Rick Stevens, and Tim Kuhfuss. Overview of the I-WAY: Wide-area visual supercomputing. The International Journal of Supercomputer Applications and High Performance Computing, 10 (2/3):123–131, Summer/Fall 1996.

  13. Jack Dongarra, Hans Meuer, and Erich Strohmaier. Top 500 Report. WWW Page, 1998. http://www.netlib.org/benchmark/top500/top500.list.html.

  14. Allen B. Downey. A parallel workload model and its implications for processor allocation. Technical Report CSD-96-922, University of California, Berkeley, November 6, 1996.

  15. Allen B. Downey. A model for speedup of parallel programs. Technical Report CSD-97-933, University of California, Berkeley, January 30, 1997.

  16. D. G. Feitelson. Packing schemes for gang scheduling. Lecture Notes in Computer Science, 1162:89ff, 1996.

  17. D. G. Feitelson and B. Nitzberg. Job characteristics of a production parallel scientific workload on the NASA Ames iPSC/860. Lecture Notes in Computer Science, 949:337ff, 1995.

  18. D. G. Feitelson and L. Rudolph. Metrics and benchmarking for parallel job scheduling. Lecture Notes in Computer Science, 1459:1ff, 1998.

  19. D. G. Feitelson, L. Rudolph, U. Schwiegelshohn, and K. C. Sevcik. Theory and practice in parallel job scheduling. Lecture Notes in Computer Science, 1291:1ff, 1997.

  20. I. Foster and C. Kesselman. Globus: A metacomputing infrastructure toolkit. The International Journal of Supercomputer Applications and High Performance Computing, 11 (2):115–128, Summer 1997.

  21. J. Gehring and F. Ramme. Architecture-independent request-scheduling with tight waiting-time estimations. Lecture Notes in Computer Science, 1162:65ff, 1996.

  22. J. Gehring, A. Reinefeld, and A. Weber. PHASE and MICA: Application specific metacomputing. In Proceedings of Europar 97, Passau, Germany, 1997.

  23. Genias Software GmbH, Erzgebirgstr. 2B, D-93073 Neutraubling. CODINE User’s Guide, 1993. http://www.genias.de/genias/english/codine/.

  24. C. A. R. Hoare. Quicksort. In C. A. R. Hoare and C. B. Jones (Eds.), Essays in Computing Science, Prentice Hall, 1989.

  25. Chao-Ju Hou and Kang G. Shin. Implementation of decentralized load sharing in networked workstations using the Condor package. Journal of Parallel and Distributed Computing, 40 (2):173–184, February 1997.

  26. IBM Corporation. Using and Administering LoadLeveler (Release 3.0), 4 edition, August 1996. Document Number SC23-3989-00.

  27. K. Koski. A step towards large scale parallelism: building a parallel computing environment from heterogenous resources. Future Generation Computer Systems, 11 (4-5):491–498, August 1995.

  28. Robert R. Lipman and Judith E. Devaney. Websubmit: running supercomputer applications via the web. In Supercomputing ’96, Pittsburgh, PA, November 1996.

  29. Walter T. Ludwig. Algorithms for scheduling malleable and nonmalleable parallel tasks. Technical Report CS-TR-95-1279, University of Wisconsin, Madison, August 1995.

  30. The NRW Metacomputing Initiative. WWW Page. http://www.unipaderborn.de/pc2/nrwmc/.

  31. B. J. Overeinder and P. M. A. Sloot. Breaking the curse of dynamics by task migration: Pilot experiments in the polder metacomputer.

  32. E. W. Parsons and K. C. Sevcik. Implementing multiprocessor scheduling disciplines. Lecture Notes in Computer Science, 1291:166ff, 1997.

  33. Platform Computing Corporation. LSF Product Information. WWW Page, October 1996. http://www.platform.com/.

  34. F. Ramme and K. Kremer. Scheduling a metacomputer by an implicit voting system. In Int. IEEE Symposium on High-Perform94

  35. A. Reinefeld, R. Baraglia, T. Decker, J. Gehring, D. Laforenza, F. Ramme, T. Römke, and J. Simon. The MOL project: An open, extensible metacomputer. In Debra Hensgen, editor, Proceedings of the 6th Heterogeneous Computing Workshop, pages 17–31, Washington, April 1 1997. IEEE Computer Society Press.

  36. V. Sander, D. Erwin, and V. Huber. High-performance computer management based on Java. Lecture Notes in Computer Science, 1401:526ff, 1998.

  37. M. Schwehm and T. Walter. Mapping and scheduling by genetic algorithms. Lecture Notes in Computer Science, 854:832ff, 1994.

  38. Uwe Schwiegelshohn. Preemptive weighted completion time scheduling of parallel jobs. In Josep Díaz and Maria Serna, editors, Algorithms ESA ’96, Fourth Annual European Symposium, volume 1136 of Lecture Notes in Computer Science, pages 39–51, Barcelona, Spain, 25–27 September 1996. Springer.

  39. Uwe Schwiegelshohn and Ramin Yahyapour. Analysis of first-come-first-serve parallel job scheduling. In Proceedings of the Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 629–638, San Francisco, California, 25-27 January 1998. i. A comparative analysis of static processor partitioning policies for parallel computers. In Internat. Workshop on Modeling and Simulation of Computer and Telecommunication Systems (MASCOTS), pages 283–286, January 1993.

  40. Jon Siegel. CORBA: Fundamentals and Programming. John Wiley & Sons Inc., New York, 1 edition, 1996.

  41. Larry Smarr and Charles E. Catlett. Metacomputing. Communications of the ACM, 35 (6):44–52, June 1992.

  42. W. Smith, I. Foster, and V. Taylor. Predicting application run times using historical information. Lecture Notes in Computer Science, 1459:122ff, 1998.

  43. A. W. van Halderen, Benno J. Overeinder, Peter M. A. Sloot, R. van Dantzig, Dick H. J. Epema, and Miron Livny. Hierarchical resource management in the polder metacomputing initiative. submitted to Parallel Computing, 1997.

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gehring, J., Preiss, T. (1999). Scheduling a Metacomputer with Uncooperative Sub-schedulers. In: Feitelson, D.G., Rudolph, L. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 1999. Lecture Notes in Computer Science, vol 1659. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47954-6_10

  • DOI: https://doi.org/10.1007/3-540-47954-6_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66676-9

  • Online ISBN: 978-3-540-47954-3

  • eBook Packages: Springer Book Archive
