
Runtime support for virtual BSP computer

  • Workshop on Run-Time Systems for Parallel Programming: Matthew Haines, University of Wyoming, USA; Koen Langendoen, Vrije Universiteit, The Netherlands; Greg Benson, University of California at Davis, USA
  • Conference paper
Parallel and Distributed Processing (IPPS 1998)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1388))


Abstract

Several computing environments, including wide-area networks and non-dedicated networks of workstations, are characterized by the frequent unavailability of the participating machines. Parallel computations, with interdependencies among their component processes, cannot make progress if some of the participating machines become unavailable during the computation. As a result, to deliver acceptable performance, the set of participating processors must be adjusted dynamically to follow changes in the computing environment. In this paper, we discuss the design of a run-time system to support a Virtual BSP Computer, which allows BSP programmers to treat a network of transient processors as a dedicated network. The Virtual BSP Computer enables parallel applications to remove computations from processors that become unavailable and thereby adapt to the changing computing environment. The run-time system, which we refer to as the adaptive replication system (ARS), uses replication of data and computations to keep current a mapping of a set of virtual processors to a subset of the available machines. ARS has been implemented and integrated with a message-passing library for the Bulk-Synchronous Parallel (BSP) model. The extended library has been applied to two parallel applications with the aim of using idle machines in a network of workstations (NOW) for parallel computations. We present the performance results of ARS for these applications.
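The core idea the abstract describes, a fixed set of virtual processors whose mapping onto currently available machines can be revised between BSP supersteps, can be sketched as follows. This is a minimal illustrative sketch only, not the paper's ARS implementation: the class and method names (`VirtualBSPComputer`, `remap`, `superstep`) and the host names are hypothetical, and replication is modeled simply as per-virtual-processor state that survives remapping.

```python
# Sketch of a "virtual BSP computer": virtual processors (VPs) are mapped onto
# whichever physical machines are currently available, and each VP's state is
# kept independently of the mapping, so computations can move off machines
# that become unavailable between supersteps. Hypothetical names throughout.

class VirtualBSPComputer:
    def __init__(self, num_virtual, available_machines):
        self.num_virtual = num_virtual
        # Per-VP state, independent of which machine currently hosts the VP.
        self.state = [{} for _ in range(num_virtual)]
        self.mapping = {}
        self.remap(available_machines)

    def remap(self, available_machines):
        # Reassign every virtual processor to some currently available machine.
        if not available_machines:
            raise RuntimeError("no machines available")
        self.mapping = {vp: available_machines[vp % len(available_machines)]
                        for vp in range(self.num_virtual)}

    def superstep(self, compute):
        # One BSP superstep: local computation on every virtual processor,
        # followed by a barrier (implicit here, since execution is serialized).
        for vp in range(self.num_virtual):
            self.state[vp] = compute(vp, self.state[vp])

# Usage: 8 virtual processors on 3 machines; one machine then leaves.
vbsp = VirtualBSPComputer(8, ["hostA", "hostB", "hostC"])
vbsp.superstep(lambda vp, s: {**s, "step": s.get("step", 0) + 1})
vbsp.remap(["hostA", "hostC"])        # hostB became unavailable
vbsp.superstep(lambda vp, s: {**s, "step": s["step"] + 1})
print(vbsp.state[0]["step"])          # → 2
```

The point of the sketch is that no VP state is lost when `hostB` drops out: the second superstep still advances all eight virtual processors, only now hosted by the two remaining machines.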

This work was partially supported by NSF Grant CCR-9527151. The content does not necessarily reflect the position or policy of the U.S. Government.




Editor information

José Rolim


Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nibhanupudi, M.V., Szymanski, B.K. (1998). Runtime support for virtual BSP computer. In: Rolim, J. (eds) Parallel and Distributed Processing. IPPS 1998. Lecture Notes in Computer Science, vol 1388. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-64359-1_685


  • DOI: https://doi.org/10.1007/3-540-64359-1_685


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-64359-3

  • Online ISBN: 978-3-540-69756-5

  • eBook Packages: Springer Book Archive
