Abstract
Coscheduling has been shown to be a critical factor in achieving efficient parallel execution in timeshared environments [12, 19, 4]. However, the most common approach, gang scheduling, has limitations in scaling, can compromise good interactive response, and requires that communicating processes be identified in advance.
We explore a technique called dynamic coscheduling (DCS) that produces emergent coscheduling of the processes constituting a parallel job. Experiments are performed in a workstation environment with high performance networks and autonomous timesharing schedulers for each CPU. The results demonstrate that DCS can achieve effective, robust coscheduling for a range of workloads and background loads. Empirical comparisons to implicit scheduling and uncoordinated scheduling are presented. Under spin-block synchronization, DCS reduces job response times by up to 20% relative to implicit scheduling while maintaining fairness; under spinning synchronization, it reduces job response times by up to two decimal orders of magnitude relative to uncoordinated scheduling. The results suggest that DCS is a promising avenue for achieving coordinated parallel scheduling in an environment that coexists with autonomous node schedulers.
References
1. Nanette J. Boden, Danny Cohen, Robert E. Felderman, Alan E. Kulawik, Charles L. Seitz, Jakov N. Seizovic, and Wen-King Su. Myrinet: a gigabit-per-second local-area network. IEEE Micro, 15(1):29–36, February 1995. Available from http://www.myri.com/research/publications/Hot.ps.
2. Rohit Chandra, Scott Devine, Ben Verghese, Anoop Gupta, and Mendel Rosenblum. Scheduling and page migration for multiprocessor compute servers. In Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 12–24, San Jose, California, 1994.
3. Andrea C. Dusseau, Remzi H. Arpaci, and David E. Culler. Effective distributed scheduling of parallel workloads. In ACM SIGMETRICS '96 Conference on the Measurement and Modeling of Computer Systems, 1996. Available from http://www.cs.berkeley.edu/~dusseau/Papers/sigmetrics96.ps.
4. Dror G. Feitelson and Larry Rudolph. Distributed hierarchical control for parallel processing. IEEE Computer, 23(5):65–77, May 1990.
5. Dror G. Feitelson and Larry Rudolph. Gang Scheduling Performance Benefits for Fine-Grained Synchronization. Journal of Parallel and Distributed Computing, 16(4):306–18, December 1992.
6. Dror G. Feitelson and Larry Rudolph. Coscheduling based on run-time identification of activity working sets. International Journal of Parallel Programming, 23(2):135–160, April 1995.
7. Richard B. Gillett. Memory Channel network for PCI. IEEE Micro, 16(1):12–18, February 1996. Available from http://www.computer.org/pubs/micro/web/mlgil.pdf.
8. Anoop Gupta, Andrew Tucker, and Shigeru Urushibara. The impact of operating system scheduling policies and synchronization methods on the performance of parallel applications. In ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, pages 120–132, May 1991. Available from http://xenon.stanford.edu/~tucker/papers/sigmetrics.ps.
9. D. B. Gustavson. The scalable coherent interface and related standards projects. IEEE Micro, 12(1), February 1992.
10. Sun Microsystems Inc. ts_dptbl(4) manual page. SunOS 5.4 Manual. Section 4.
11. Mario Lauria and Andrew Chien. MPI-FM: High performance MPI on workstation clusters. Submitted to the Journal of Parallel and Distributed Computing. Available from http://www-csag.cs.uiuc.edu/papers/mpi-fm.ps.
12. John K. Ousterhout. Scheduling techniques for concurrent systems. In Proceedings of the 3rd International Conference on Distributed Computing Systems, pages 22–30, October 1982.
13. Scott Pakin, Vijay Karamcheti, and Andrew A. Chien. Fast Messages (FM): Efficient, portable communication for workstation clusters and massively-parallel processors. IEEE Concurrency, 1997.
14. Scott Pakin, Mario Lauria, Matt Buchanan, Kay Hane, Louis Giannini, Jane Prusakova, and Andrew Chien. Fast Messages 2.0 User Documentation, October 1996.
15. Scott Pakin, Mario Lauria, and Andrew Chien. High performance messaging on workstations: Illinois Fast Messages (FM) for Myrinet. In Supercomputing, December 1995. Available from http://www-csag.cs.uiuc.edu/papers/myrinet-fm-sc95.ps.
16. Patrick G. Sobalvarro. Demand-based Coscheduling of Parallel Jobs on Multi-programmed Multiprocessors. PhD thesis, Massachusetts Institute of Technology, 1997. MIT/LCS/TR-710.
17. Patrick G. Sobalvarro and William E. Weihl. Demand-based coscheduling of parallel jobs on multiprogrammed multiprocessors. In Proceedings of the Parallel Job Scheduling Workshop at IPPS '95, 1995. Available from http://www.psg.les.mit.edu/~pgs/papers/jsw-for-springer.ps. Also appears in Springer-Verlag Lecture Notes in Computer Science, Vol. 949.
18. Andrew Tucker. Efficient scheduling on multiprogrammed shared-memory multiprocessors. Technical Report CSL-TR-94-601, Stanford University Department of Computer Science, November 1993. Available from http://elib.stanford.edu/Dienst/UI/2.0/Describe/stanford.cs/CSL-TR-94-601.
19. Andrew Tucker and Anoop Gupta. Process control and scheduling issues for multiprogrammed shared-memory multiprocessors. In Proceedings of the 12th ACM SIGOPS Symposium on Operating Systems Principles, pages 159–186, 1989. Available from http://xenon.stanford.edu/~tucker/papers/sosp.ps.
20. T. von Eicken, D. Culler, S. Goldstein, and K. Schauser. Active Messages: a mechanism for integrated communication and computation. In Proceedings of the International Symposium on Computer Architecture, 1992.
21. Thorsten von Eicken, Anindya Basu, Vineet Buch, and Werner Vogels. U-Net: A user-level network interface for parallel and distributed computing. In Proceedings of the 15th ACM Symposium on Operating Systems Principles, December 1995. Available from http://www.cs.cornell.edu/Info/Projects/ATM/sosp.ps.
22. Carl A. Waldspurger. Lottery and Stride Scheduling: Flexible Proportional-Share Resource Management. PhD thesis, Massachusetts Institute of Technology, 1995. MIT/LCS/TR-667.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sobalvarro, P.G., Pakin, S., Weihl, W.E., Chien, A.A. (1998). Dynamic coscheduling on workstation clusters. In: Feitelson, D.G., Rudolph, L. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 1998. Lecture Notes in Computer Science, vol 1459. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0053990
DOI: https://doi.org/10.1007/BFb0053990
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64825-3
Online ISBN: 978-3-540-68536-4
eBook Packages: Springer Book Archive