
Parallel processing on dynamic resources with CARMI

  • Conference paper
Job Scheduling Strategies for Parallel Processing (JSSPP 1995)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 949)

Abstract

In every production parallel processing environment, the set of resources potentially available to an application fluctuates with changes in the load on the system. This is true of clusters of workstations, which are an increasingly popular platform for parallel computing. Today's parallel programming environments have largely succeeded in making the communication aspect of parallel programming much easier, but they have not provided the resource management services needed to adapt to such changes in availability. To fill this need, we have developed CARMI, a resource management system aimed at allowing a parallel application to make use of all available computing power. CARMI permits an application to grow as new resources become available and to shrink when resources are reclaimed. Building upon CARMI, we have also developed WoDi, which provides a simple interface for writing master-workers programs in a dynamic resource environment. Both CARMI and WoDi are operational and have been used on a pool of more than 200 workstations managed by the Condor batch system. Experience with the two systems has shown them to be easy to use and capable of providing large numbers of cycles to parallel applications, even in a real-life production environment in which no resources are dedicated to parallel processing.
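The grow-and-shrink behaviour described in the abstract can be illustrated with a small, self-contained simulation of a master-workers program. This is only a sketch under stated assumptions: it does not use or reproduce the CARMI or WoDi interfaces, and every name in it (`DynamicMaster`, `add_worker`, `remove_worker`, `run`, the scripted `events` table) is hypothetical. It assumes each task takes one time step and that a task running on a reclaimed worker is simply returned to the queue.

```python
from collections import deque


class DynamicMaster:
    """Toy master that hands tasks to a changing set of workers.

    Hypothetical illustration only; not the CARMI or WoDi API.
    """

    def __init__(self, tasks):
        self.pending = deque(tasks)  # tasks not yet assigned
        self.running = {}            # worker name -> task in progress
        self.idle = set()            # workers with nothing to do
        self.results = {}            # task -> result

    def add_worker(self, name):
        """Grow: a new resource became available."""
        self.idle.add(name)

    def remove_worker(self, name):
        """Shrink: a resource was reclaimed; requeue its unfinished task."""
        self.idle.discard(name)
        task = self.running.pop(name, None)
        if task is not None:
            self.pending.appendleft(task)

    def dispatch(self):
        """Assign pending tasks to idle workers, in name order."""
        for name in sorted(self.idle):
            if not self.pending:
                break
            self.running[name] = self.pending.popleft()
            self.idle.discard(name)

    def complete(self, name, work):
        """Record the result of the task that a worker was running."""
        task = self.running.pop(name)
        self.results[task] = work(task)
        self.idle.add(name)

    def done(self):
        return not self.pending and not self.running


def run(tasks, events, work, max_steps=1000):
    """Drive the master through scripted availability events.

    `events` maps a step number to a list of ("add" | "remove", worker)
    pairs. Assumes every task takes exactly one step to compute.
    """
    master = DynamicMaster(tasks)
    step = 0
    while not master.done():
        if step >= max_steps:
            raise RuntimeError("no progress: not enough workers available")
        for kind, name in events.get(step, []):
            if kind == "add":
                master.add_worker(name)
            else:
                master.remove_worker(name)
        # Tasks dispatched in the previous step finish now, unless their
        # worker was reclaimed by one of the events just applied.
        for name in sorted(master.running):
            master.complete(name, work)
        master.dispatch()
        step += 1
    return master.results
```

For example, `run([1, 2, 3, 4, 5], {0: [("add", "w1"), ("add", "w2")], 1: [("remove", "w2")], 2: [("add", "w3")]}, lambda t: t * t)` starts with two workers, loses one while it is still computing task 2, and later gains a third. Task 2 is requeued and finished by another worker, so all five results are still produced.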

Editor information

Dror G. Feitelson, Larry Rudolph

Copyright information

© 1995 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pruyne, J., Livny, M. (1995). Parallel processing on dynamic resources with CARMI. In: Feitelson, D.G., Rudolph, L. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 1995. Lecture Notes in Computer Science, vol 949. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60153-8_33

  • DOI: https://doi.org/10.1007/3-540-60153-8_33

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-60153-1

  • Online ISBN: 978-3-540-49459-1

  • eBook Packages: Springer Book Archive
