Abstract
A practical problem faced by users of high-performance computers is: How can I automatically load balance my jobs across different batch queues, which are in different administrative domains, if there is no existing grid infrastructure? It is common to have user accounts for a number of individual high-performance systems (e.g., departmental, university, regional) that are administered by different groups. Without an administration-deployed grid infrastructure, one can still create a purely user-level aggregation of individual computing systems. The Trellis Project is developing the techniques and tools to take advantage of a user-level overlay metacomputer. Because placeholder scheduling does not require superuser permissions to set up or configure, it is well-suited to overlay metacomputers. This paper contributes to the practical side of grid and metacomputing by empirically demonstrating that placeholder scheduling can work across different administrative domains, across different local schedulers (i.e., PBS and Sun Grid Engine), and across different programming models (i.e., Pthreads, MPI, and sequential). We also describe a new metaqueue system to manage jobs with explicit workflow dependencies.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pinchak, C., Lu, P., Goldenberg, M. (2002). Practical Heterogeneous Placeholder Scheduling in Overlay Metacomputers: Early Experiences. In: Feitelson, D.G., Rudolph, L., Schwiegelshohn, U. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2002. Lecture Notes in Computer Science, vol 2537. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36180-4_11
DOI: https://doi.org/10.1007/3-540-36180-4_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00172-0
Online ISBN: 978-3-540-36180-0
eBook Packages: Springer Book Archive