PARDIS: Programmer-level abstractions for metacomputing

https://doi.org/10.1016/S0167-739X(99)00015-1

Abstract

The potential offered by metacomputing is hard to realize due to the complexity of programming geographically distributed applications spanning different software systems. This paper describes PARDIS, a system designed to address this challenge, based on ideas underlying the Common Object Request Broker Architecture (CORBA), a successful industry standard. PARDIS is a distributed environment in which objects representing data-parallel computations, called Single Program Multiple Data (SPMD) objects, as well as the non-parallel objects present in parallel programs, can interact with each other across platforms and software systems. Each of these objects represents a small encapsulated application and can be used as a building block in the construction of powerful distributed metaapplications. The objects interact through interfaces specified in an Interface Definition Language (IDL), which allows the programmer to integrate, within one metaapplication, components implemented using different software systems. Further, support for non-blocking interactions between objects allows PARDIS to support concurrent distributed execution scenarios.

Introduction

Due to recent technical innovations, networking technologies have reached performance levels acceptable for supercomputing and have enabled a new class of high-performance applications able to exploit the heterogeneity of diverse software systems and geographically distributed resources. The advantages that these metacomputing environments [1], [2] offer are enormous; however, they are seldom realized due to the difficulty of building such systems. Up to now, metacomputing applications have usually been developed in an ad hoc fashion, explicitly combining different communication libraries and developing special-case tools [3]. Systems constructed in this way usually require extensive modifications to the original application code in order to integrate it into a metacomputing environment, and result in software which is complex and difficult to debug and maintain. This problem gives rise to the need for a general-purpose metacomputing environment providing abstractions suitable for easy and efficient coupling of high-performance scientific applications.

In order to fulfill these requirements, reduce the complexity, and accelerate the process of developing these applications, such a system should implement application-level interaction of heterogeneous, parallel components in a distributed environment. That is, it should allow the programmer to build metaapplications from independently developed and tested components which, once implemented, can be reused as building blocks in many scenarios. To achieve this, such a system must be able to define interactive interfaces to components which hide enough detail to free the programmer from having to understand the implementation of any particular component, and which are at the same time flexible enough to define the components accurately and enable efficient interaction with them. Hiding implementation details should also allow combining components developed using different software systems, and in this way enable metaapplications to exploit the heterogeneity of their components. Further, the system should provide an elegant and flexible way of handling remote interactions to ensure interoperability with remote components.

In the field of distributed computing a similar set of problems has been addressed by DCOM [4] and the Common Object Request Broker Architecture (CORBA) [5], which have successfully built infrastructures supporting the integration of heterogeneous distributed applications within one system. However, the abstractions implemented in CORBA are not always efficient or suitable for parallel computing; for example, they create sequential bottlenecks in interactions with data-parallel applications. In this paper we describe PARDIS, a system which adapts the ideas introduced in CORBA to suit the needs of PARallel DIStributed objects.

PARDIS builds on CORBA in that it allows the programmer to construct metaapplications using independently developed heterogeneous components interacting through interfaces specified in an Interface Definition Language (IDL) common to all components. Using interfaces defined in this way, the components can be plugged into a reusable infrastructure capable of combining remote or local components into metaapplications. The fact that the combining infrastructure is reusable, and that objects need only provide an interface in order to use it (with no need to modify their implementation), allows metaapplications to be easily put together and taken apart, and their components to be reused in building other metaapplications. PARDIS modifies the set of abstractions implemented in CORBA by introducing Single Program Multiple Data (SPMD) objects representing data-parallel computations, and distributed sequences representing the notion of distributed data in a parallel program. Further, PARDIS accommodates the need for concurrency in high-performance applications by allowing non-blocking invocations, and, aside from a general-purpose solution suitable for all data-parallel systems, provides direct interoperability with two existing data-parallel libraries: POOMA [6] and HPC++ PSTL [7].
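The non-blocking invocations mentioned above let a client issue requests to several server objects and continue working before collecting the results. The sketch below mimics that interaction pattern locally with Python futures; the function names and workloads are hypothetical stand-ins for remote object methods, not the actual PARDIS API.

```python
# Sketch of the non-blocking invocation pattern, using Python futures as a
# local analogy. `request_a` and `request_b` are hypothetical stand-ins for
# methods on two remote PARDIS server objects.
from concurrent.futures import ThreadPoolExecutor

def request_a(n):
    # stand-in for a request dispatched to one server object
    return sum(range(n))

def request_b(n):
    # stand-in for a request dispatched to an independent server object
    return n * (n - 1) // 2

with ThreadPoolExecutor() as pool:
    # both invocations are issued without blocking the client ...
    fa = pool.submit(request_a, 1000)
    fb = pool.submit(request_b, 1000)
    # ... the client is free to do other work here, then synchronizes
    ra, rb = fa.result(), fb.result()
```

In PARDIS itself the analogous effect is achieved at the ORB level, so that the invoked data-parallel servers can also execute concurrently with each other.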

We begin our discussion of PARDIS by describing in detail its architecture, its implementation, and the ways in which a programmer can interface with it. We then evaluate the main abstraction introduced in PARDIS, SPMD objects, and finally give a few concrete examples of how PARDIS can be used to construct metaapplications.

Section snippets

Design and implementation of PARDIS

In this section we discuss a high-level view of the design and implementation of PARDIS. The ideas described here have been prototyped and tested on SGI multiprocessors, the IBM SP2, and clusters of Sun workstations.

Performance evaluation of SPMD objects

SPMD objects are a useful abstraction for interacting with data-parallel applications, as they ensure request delivery to all the computing threads of these applications. In addition, access to all computing threads of the interacting programs, together with knowledge of the data distribution in each program, allows the ORB to compute a transfer schedule which enables it to send data directly between corresponding threads of the client and server. This can bring significant advantages by parallelizing the
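The transfer-schedule idea can be illustrated concretely. Given block distributions of the same N elements over P client threads and Q server threads, one can compute which (client, server) thread pairs must exchange which index ranges, so that data moves directly between corresponding threads instead of through a sequential bottleneck. The sketch below is our reconstruction of the concept, not the PARDIS ORB's actual algorithm.

```python
# Hedged sketch: computing a point-to-point transfer schedule between two
# block distributions of the same index space. This reconstructs the idea
# described in the text, not the ORB's real implementation.

def block_ranges(n, p):
    """Contiguous [lo, hi) index range owned by each of p threads."""
    base, extra = divmod(n, p)
    ranges, lo = [], 0
    for i in range(p):
        hi = lo + base + (1 if i < extra else 0)
        ranges.append((lo, hi))
        lo = hi
    return ranges

def transfer_schedule(n, p_client, q_server):
    """Return (client_thread, server_thread, start, length) transfers."""
    schedule = []
    for ci, (clo, chi) in enumerate(block_ranges(n, p_client)):
        for si, (slo, shi) in enumerate(block_ranges(n, q_server)):
            lo, hi = max(clo, slo), min(chi, shi)
            if lo < hi:  # overlapping ownership: one direct message
                schedule.append((ci, si, lo, hi - lo))
    return schedule

# 10 elements, 2 client threads, 5 server threads: six direct messages
sched = transfer_schedule(10, 2, 5)
assert sum(length for *_, length in sched) == 10  # every element moved once
```

Because each entry names a specific client/server thread pair, all the messages in the schedule can proceed in parallel, which is the source of the performance advantage discussed above.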

Concurrent execution of data-parallel components

This example provides a basic introduction to programming with PARDIS, and describes three of its features: non-blocking invocations, locality transparency, and ease of manipulation of dynamically sized data structures. We consider a scenario in which the same system of linear equations is solved by a direct method and an iterative method; the returned solutions are then compared to quantify the agreement between the two methods.

Both solvers were implemented as SPMD servers successively
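The scenario above can be sketched locally without assuming anything about the PARDIS API: solve one diagonally dominant system by a direct method (Gaussian elimination) and by an iterative method (Jacobi sweeps), then compare the two solutions. In PARDIS the two solvers would run as remote SPMD servers invoked without blocking; here they are plain functions.

```python
# Minimal local analogue of the example scenario: a direct and an iterative
# solver applied to the same system, followed by an agreement check.

def gauss_solve(A, b):
    """Direct method: Gaussian elimination with back-substitution."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for k in range(n):
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def jacobi_solve(A, b, iters=100):
    """Iterative method: Jacobi sweeps (converge under diagonal dominance)."""
    n = len(A)
    x = [0.0] * n
    for _ in range(iters):
        x = [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
             for i in range(n)]
    return x

# a small diagonally dominant system with exact solution (1, 1, 1)
A = [[4.0, 1.0, 0.0], [1.0, 5.0, 2.0], [0.0, 2.0, 6.0]]
b = [5.0, 8.0, 8.0]
xd, xi = gauss_solve(A, b), jacobi_solve(A, b)
# agreement between the two methods, as in the paper's scenario
agreement = max(abs(d - i) for d, i in zip(xd, xi))
assert agreement < 1e-8
```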

Conclusions and future work

This paper described PARDIS, an environment allowing the programmer to construct metaapplications from high-performance components residing on distributed servers. Organizing the interaction between components at the application level, and using an IDL to represent them to each other, allowed us to combine within one metaapplication components implemented using different parallel libraries. Further, we have discussed examples which show that PARDIS contains support for abstractions which allow the

Katarzyna Keahey ([email protected]) is a researcher at the Advanced Computing Laboratory in Los Alamos National Laboratory. Her research interests focus on component architectures and programming, run-time systems, and performance engineering. She received an M.S. in Computer Science from Indiana University in 1994, and later, in 1998, a Ph.D. in Computer Science from the same institution.

References (13)

  • C. Catlett et al., Metacomputing, Communications of the ACM (1992)
  • I. Foster et al., Globus: A Metacomputing Infrastructure Toolkit, The International Journal of Supercomputer Applications and High Performance Computing (1997)
  • M.L. Norman et al., Galaxies Collide on the I-WAY: An Example of Heterogeneous Wide-Area Collaborative Supercomputing, The International Journal of Supercomputer Applications and High Performance Computing (1996)
  • R. Sessions, COM and DCOM: Microsoft's Vision for Distributed Objects, Wiley,...
  • OMG, The Common Object Request Broker: Architecture and Specification, Revision 2.0, OMG Document, June...
  • S. Atlas, S. Banerjee, J.C. Cummings, P.J. Hinker, M. Srikant, J.V.W. Reynders, M. Tholburn, POOMA: A High Performance...

Cited by (2)

  • Programming the Grid with POP-C++

    2007, Future Generation Computer Systems
    Citation excerpt:
    Therefore, our OD can be customized, based on the real input parameters of the object. On the tool aspect, COBRA [20] and Parallel Data CORBA [21] extend CORBA by encapsulating several distributed components (object parts) within an object and by implementing the data parallelism based on automatic data partitioning. This differs from our approach in which each parallel object resides in a single memory address space and the parallelism is achieved by parallel executions among objects and concurrent executions of methods within the same object.

  • Implementation of a CORBA-based metacomputing system

    2001, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

