Runtime support for virtual BSP computer

Nibhanupudi, Mohan V.; Szymanski, Boleslaw K.

doi:10.1007/3-540-64359-1_685

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1388)

Included in the following conference series:

International Parallel Processing Symposium

Abstract

Several computing environments, including wide area networks and nondedicated networks of workstations, are characterized by frequent unavailability of the participating machines. Parallel computations, with interdependencies among their component processes, cannot make progress if some of the participating machines become unavailable during the computation. As a result, to deliver acceptable performance, the set of participating processors must be dynamically adjusted to follow changes in the computing environment. In this paper, we discuss the design of a runtime system to support a Virtual BSP Computer that allows BSP programmers to treat a network of transient processors as a dedicated network. The Virtual BSP Computer enables parallel applications to remove computations from processors that become unavailable and thereby adapt to the changing computing environment. The runtime system, which we refer to as the adaptive replication system (ARS), uses replication of data and computations to keep the mapping of a set of virtual processors to a subset of the available machines up to date. ARS has been implemented and integrated with a message passing library for the Bulk-Synchronous Parallel (BSP) model. The extended library has been applied to two parallel applications with the aim of using idle machines in a network of workstations (NOW) for parallel computations. We present the performance results of ARS for these applications.

This work was partially supported by NSF Grant CCR-9527151. The content does not necessarily reflect the position or policy of the U.S. Government.
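
To make the idea in the abstract concrete, here is a minimal Python sketch of a mapping from virtual processors to transient machines that is kept current through replication. It is not the ARS implementation and does not use the BSP library's API. The class name `VirtualBSP`, the method `machine_lost`, and the ring placement of replicas are all assumptions made for illustration. The sketch only tracks where each virtual processor runs and where a copy of its state lives. It assumes a loss is handled at a superstep boundary.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VirtualBSP:
    """Toy mapping of virtual processors onto transient machines (hypothetical)."""
    machines: List[str]
    # virtual processor id -> machine currently executing it
    primary: Dict[int, str] = field(default_factory=dict)
    # virtual processor id -> machine holding a replica of its state
    replica: Dict[int, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        n = len(self.machines)
        for vp in range(n):
            self.primary[vp] = self.machines[vp]
            # Assumption: the replica lives on the next machine in a ring.
            self.replica[vp] = self.machines[(vp + 1) % n]

    def machine_lost(self, lost: str) -> None:
        """Remap virtual processors away from a machine that became unavailable."""
        alive = [m for m in self.machines if m != lost]
        if not alive:
            raise RuntimeError("no machines left to host the computation")
        self.machines = alive
        for vp in self.primary:
            if self.primary[vp] == lost:
                # The replica holder takes over this virtual processor.
                self.primary[vp] = self.replica[vp]
            if self.replica[vp] == lost or self.replica[vp] == self.primary[vp]:
                # Place a fresh replica on another live machine when one exists.
                others = [m for m in alive if m != self.primary[vp]]
                self.replica[vp] = others[0] if others else self.primary[vp]


if __name__ == "__main__":
    vbsp = VirtualBSP(["a", "b", "c"])
    vbsp.machine_lost("b")
    print(vbsp.primary)
    print(vbsp.replica)
```

The program still sees the same three virtual processors after machine `b` disappears. Only the mapping underneath changes, which is the sense in which a network of transient processors can be treated as a dedicated one.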

Author information

Authors and Affiliations

Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180-3590, USA
Mohan V. Nibhanupudi & Boleslaw K. Szymanski

Editor information

José Rolim

Cite this paper

Nibhanupudi, M.V., Szymanski, B.K. (1998). Runtime support for virtual BSP computer. In: Rolim, J. (eds) Parallel and Distributed Processing. IPPS 1998. Lecture Notes in Computer Science, vol 1388. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-64359-1_685

DOI: https://doi.org/10.1007/3-540-64359-1_685
Published: 08 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64359-3
Online ISBN: 978-3-540-69756-5
eBook Packages: Springer Book Archive
