Abstract
In every production parallel processing environment, the set of resources potentially available to an application fluctuates due to changes in the load on the system. This is true for clusters of workstations, which are an increasingly popular platform for parallel computing. Today's parallel programming environments have largely succeeded in making the communication aspect of parallel programming much easier, but they have not provided the resource management services needed to adapt to such changes in availability. To fill this need, we have developed CARMI, a resource management system aimed at allowing a parallel application to make use of all available computing power. CARMI permits an application to grow as new resources become available and to shrink when resources are reclaimed. Building upon CARMI, we have also developed WoDi, which provides a simple interface for writing master-workers programs in a dynamic resource environment. Both CARMI and WoDi are operational and have been used on a pool of more than 200 workstations managed by the Condor batch system. Experience with the two systems has shown them to be easy to use and capable of providing large numbers of cycles to parallel applications, even in a real-life production environment in which no resources are dedicated to parallel processing.
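As a rough illustration of the master-workers pattern on a pool that grows and shrinks, the following minimal sketch simulates a master that requeues the task of any worker whose machine is reclaimed. It is not the CARMI or WoDi interface, which this abstract does not describe; the class `Master`, its methods, and the worker names are all hypothetical, and the run is a single-process simulation rather than a distributed program.

```python
from collections import deque


class Master:
    """Toy master for a master-workers program on a changing worker pool.

    Hypothetical illustration only; not the CARMI or WoDi API.
    """

    def __init__(self, tasks):
        self.pending = deque(tasks)   # tasks not yet handed to any worker
        self.running = {}             # worker id -> task currently assigned
        self.idle = set()             # workers that hold no task
        self.results = {}             # task -> result

    def add_worker(self, worker):
        # A new resource became available: the computation grows.
        self.idle.add(worker)

    def remove_worker(self, worker):
        # The resource was reclaimed: the computation shrinks.
        # Any task the worker held goes back on the queue so no work is lost.
        self.idle.discard(worker)
        task = self.running.pop(worker, None)
        if task is not None:
            self.pending.appendleft(task)

    def dispatch(self):
        # Hand pending tasks to idle workers until one of the two runs out.
        while self.pending and self.idle:
            worker = self.idle.pop()
            self.running[worker] = self.pending.popleft()

    def complete(self, worker, result):
        # A worker reports a result and becomes idle again.
        task = self.running.pop(worker)
        self.results[task] = result
        self.idle.add(worker)

    def done(self):
        return not self.pending and not self.running


def run_demo():
    master = Master(tasks=range(6))
    master.add_worker("w1")
    master.add_worker("w2")
    master.dispatch()
    # "w2" is reclaimed before it finishes; its task is requeued.
    master.remove_worker("w2")
    step = 0
    while not master.done():
        master.dispatch()
        # Simulate every busy worker finishing its task (squaring it).
        for worker, task in list(master.running.items()):
            master.complete(worker, task * task)
        step += 1
        if step == 2:
            master.add_worker("w3")   # the pool grows mid-run
    return master.results
```

In this sketch the master never assumes a fixed number of workers: every task ends up computed exactly once in `results` even though one worker disappears and another joins partway through.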
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pruyne, J., Livny, M. (1995). Parallel processing on dynamic resources with CARMI. In: Feitelson, D.G., Rudolph, L. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 1995. Lecture Notes in Computer Science, vol 949. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60153-8_33
DOI: https://doi.org/10.1007/3-540-60153-8_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60153-1
Online ISBN: 978-3-540-49459-1
eBook Packages: Springer Book Archive