Parallel Computing

Volume 26, Issues 2–3, February 2000, Pages 267–284

Parallel adaptive computing on meta-systems including NOWs

https://doi.org/10.1016/S0167-8191(99)00105-2

Abstract

Load analyses of meta-systems including NOWs or COWs have shown that only a small percentage of the available power is used over long periods of time. Therefore, in order to exploit this idle time when executing a parallel application, work must be sent to a machine as soon as the latter becomes available. Furthermore, in order to respect the ownership of workstations, work has to be stopped, and resumed later, as soon as the machine executing it is reclaimed by its owner. As a consequence, users need an adaptive system that reports events related to the arrivals and departures of workstations. On the other hand, it is necessary to provide them with a parallel adaptive programming methodology that plans the handling of these events.

In this paper, we present the MARS (multi-user adaptive resource scheduler) system, developed at the LIFL laboratory, Université de Lille I, and its parallel adaptive programming methodology through the block-based Gauss–Jordan algorithm, used in numerical analysis to invert large matrices. Moreover, we propose a work scheduling strategy and an application-oriented solution to the fault tolerance issue. Furthermore, we present some experimental results obtained on a DEC/ALPHA COW and a SUN/Sparc4 NOW. They show that very high absolute efficiencies can be obtained if the size of the blocks is well chosen. We also present some experiments related to the adaptability of the application in a meta-system including the DEC/ALPHA COW and the SUN/Sparc4 NOW. The results show that the management of adaptability consumes only a small percentage of the execution time.

Introduction

Over the past few years, parallel computing has seen the emergence of networks (NOWs) and clusters (COWs) of workstations. In particular, for applications with little or no time constraints, NOWs/COWs compete seriously with supercomputers such as the Intel Paragon, the IBM SP2, etc. The main reasons are the following: first, NOWs are scalable; second, the performance/cost ratio of workstations constantly increases; third, the technological evolution of networks such as Myrinet, ATM, Gigabit Ethernet, etc., makes them faster and faster; and finally, the arrival of parallel programming environments such as PVM [1] and MPI [2] has promoted NOW/COW programming.

Obviously, NOWs/COWs do not spell the end of supercomputers; on the contrary, both can be integrated into a platform called a meta-system. Meta-systems including a NOW/COW have some characteristics which have a great impact on the parallel programming methodology. The major ones are the following:

  • Only a small percentage of the available power is used over long periods of time. Indeed, a load analysis is presented in [3] for a meta-system of over 50 machines (DEC-ALPHA/OSF-1 processors, PCs/Linux and SunSparc4/(SunOS, Solaris)). The results show that the average observed idle time over a working day (24 h) is over 83%.

  • Failures are frequent. According to an experimental study [4] carried out on a large meta-system over a working day (24 h), one failure (hard or soft, e.g., a machine reboot) occurs every 2.78 h.

  • The last characteristic is a psychological factor: users do not want others' processes running on their own workstations while they are using them.

Therefore, according to these statements, programming on meta-systems including a NOW/COW requires a programming environment able to automatically (adaptivity) inform the user about the occurrence of events related to the availability of machines (their arrivals and departures), the fluctuation of their load, and their failures. MARS [3], Piranha [5] and CARMI [6] are examples of such environments. On the other hand, it is necessary to plan, at the application level (adaptability), a “reaction” to each event. Thus, the parallel programming methodology must be changed so that it takes the adaptive control of parallelism into account.

In this paper, we present the MARS [3] system and its parallel adaptive programming methodology through the block-based Gauss–Jordan algorithm [7], [8], [9] used in numerical analysis to invert large matrices. MARS with its programming methodology aims at:

  • Exploiting the idle time of the machines, i.e., as soon as a machine becomes available it is acquired to execute some work of the application. This operation is referred to as an application unfold.

  • Respecting the machines' ownership, i.e., when a machine is reclaimed by its owner, the work (a part of the application) running on it is withdrawn. This operation is referred to as an application fold.

  • Solving, in some way, the fault tolerance problem: if a given machine fails, only the work that was running on it is restarted. (A sketch of these three reactions is given below.)
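
To make the three reactions concrete, here is a minimal sketch of how an application might handle them. It is written in Python for readability; the names (AdaptiveApplication, WorkUnit, on_unfold, on_fold, on_failure, and the node object) are illustrative assumptions of ours, not the MARS API.

```python
from dataclasses import dataclass, field

@dataclass
class WorkUnit:
    block_id: tuple                             # which matrix block this unit computes
    state: dict = field(default_factory=dict)   # partial results saved on a fold

class AdaptiveApplication:
    """Hypothetical event handlers; names and types are illustrative,
    not taken from MARS."""

    def __init__(self):
        self.pending = []                       # work units waiting for a machine

    def on_unfold(self, node):
        # A machine became available: ship it a pending work unit.
        if self.pending:
            node.start(self.pending.pop())

    def on_fold(self, node):
        # The owner reclaimed the machine: withdraw the work unit with
        # its partial state so another node can resume it later.
        self.pending.append(node.stop())

    def on_failure(self, node):
        # The machine failed: its in-memory state is lost, so the unit
        # is restarted from scratch (or from an earlier checkpoint).
        self.pending.append(WorkUnit(block_id=node.block_id))
```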

The Gauss–Jordan algorithm is a direct method used in numerical analysis to invert large dense matrices. Classical parallel versions of this application have already been developed in [8], [9]. In [8], the application is implemented on a MIMD machine consisting of 8 processing elements (PEs), 4 data migration controllers and a 3-level shared memory, interconnected by an OMEGA network. The method includes a real-time scheduler that schedules the tasks of the application according to a task graph determined by hand. In [9], the proposed method is implemented in the Occam language on a network of 8 transputers connected as a cube. It is based on block row operations, i.e., a block row of the matrix to be inverted is sent to each transputer. During the computation, each processor produces a block of intermediate results and broadcasts it to all the transputers.

Unlike these versions, our implementation is parallel adaptive. Its idea is the following: a process (the server) maintains the two matrices of the algorithm, A (to be inverted) and B (the result of the inversion). According to the availability or unavailability of machines, it extracts blocks from these matrices and distributes or redistributes them among the machines. These blocks are used by workers, created on the available machines, to compute new blocks, which are returned to the server and used to update the matrices A and B. If a machine computing a block is reclaimed by its owner, the partial result block is folded; it is then unfolded on an available machine, which resumes the computation exactly at its interruption point. Moreover, if a machine fails, its work is restarted on an available machine.
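
As an illustration of how a block computation can be resumed exactly at its interruption point, the following Python sketch shows a foldable work unit. The row-by-row granularity and the BlockTask/should_fold names are our illustrative choices, not details taken from the paper.

```python
import numpy as np

class BlockTask:
    """A work unit computing out = pivot @ block that can be folded
    between two rows and unfolded later on another machine."""

    def __init__(self, pivot, block):
        self.pivot = pivot                   # b x b input blocks
        self.block = block
        self.out = np.zeros_like(block)      # partially computed result block
        self.row = 0                         # the interruption point

    def run(self, should_fold):
        b = self.block.shape[0]
        while self.row < b:
            if should_fold():                # owner reclaimed the machine:
                return False                 # stop; partial state stays in self
            self.out[self.row] = self.pivot[self.row] @ self.block
            self.row += 1
        return True                          # finished; send self.out to the server
```

On a fold, the server would collect the whole object (inputs, the partial out, and row) and later ship it to a newly available machine, which simply calls run again.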

The remainder of this paper is organized as follows: Section 2 describes the MARS system and its parallel adaptive programming methodology. Section 3 presents the sequential version of the block-based Gauss–Jordan algorithm and Section 4 describes its parallelization. In Section 5, we present our fault-tolerant parallel adaptive version of the algorithm and its implementation with MARS. In Section 6, we present its performance evaluation. Finally, in Section 7, we give some conclusions and directions for future work.

Section snippets

MARS: A parallel adaptive programming environment

MARS is a multi-threaded environment for programming parallel adaptive applications. It is built (see Fig. 1) on top of PM2 [10], a parallel multi-threaded execution support that couples two libraries: the PVM communication library and a thread package called MARCEL [10].

The characteristics of MARS [3] are mainly:

  • Portability: MARS runs on Sparc/SunOS, Sparc/Solaris, Alpha/OSF-1 and PC/Linux.

  • Respect of machines' ownership: When a machine executing some work of a MARS application

The sequential algorithm

Let A and B be two matrices of dimension n, where B is the inverse of A. Let A and B be partitioned into q×q matrix blocks of a fixed dimension b (b = n/q). The implemented method is the block version of Gauss–Jordan, i.e., there is no global pivoting strategy. Nevertheless, we use an element-level Gauss–Jordan method, with a pivoting strategy at that level, to invert each pivot block A_{k,k}^{k-1}. As the sizes of the blocks are large enough (at least 100 for the presented results), the global
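
For concreteness, here is a minimal sequential sketch of the block Gauss–Jordan inversion just described; Python/NumPy is our choice, and np.linalg.inv stands in for the element-level Gauss–Jordan inversion of each pivot block.

```python
import numpy as np

def block_gauss_jordan(A, q):
    """Invert A (dimension n = q*b) by Gauss-Jordan at block granularity,
    with no global pivoting across blocks."""
    n = A.shape[0]
    b = n // q
    A = A.astype(float)                      # working copy; reduced to the identity
    B = np.eye(n)                            # becomes the inverse of A
    blk = lambda M, i, j: M[i*b:(i+1)*b, j*b:(j+1)*b]

    for k in range(q):                       # one of the q steps
        P = np.linalg.inv(blk(A, k, k))      # invert the pivot block A_{k,k}^{k-1}
        for j in range(q):                   # scale block row k
            blk(A, k, j)[:] = P @ blk(A, k, j)
            blk(B, k, j)[:] = P @ blk(B, k, j)
        for i in range(q):                   # eliminate the other block rows
            if i == k:
                continue
            L = blk(A, i, k).copy()          # copy before it is zeroed
            for j in range(q):
                blk(A, i, j)[:] -= L @ blk(A, k, j)
                blk(B, i, j)[:] -= L @ blk(B, k, j)
    return B
```

For a well-conditioned A with n = q*b, np.allclose(block_gauss_jordan(A, q) @ A, np.eye(n)) should hold.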

Parallelization

The parallelization of the block-based Gauss–Jordan algorithm exploits two kinds of parallelism: inter-step parallelism and intra-step parallelism. The first one means that the q steps are partially or completely executed in parallel. Partial parallelism means that at step k only the inversion of the pivot block A_{k+1,k+1}^{k} of step k+1 is executed in anticipation, and the remainder of that step is triggered only at the end of step k. The intra-step parallelism
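
Within a step, the updates of the different block rows are independent, so they can run concurrently. The following Python sketch (our illustration, reusing the block indexing of the sequential sketch above) executes the eliminations of one step in a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def parallel_step(A, B, k, q, b, pool):
    """Execute step k with intra-step parallelism: the pivot block row is
    scaled first, then all other block rows are updated concurrently."""
    blk = lambda M, i, j: M[i*b:(i+1)*b, j*b:(j+1)*b]
    P = np.linalg.inv(blk(A, k, k))          # pivot inversion (the part that
                                             # can be anticipated at step k-1)
    for j in range(q):                       # scale block row k
        blk(A, k, j)[:] = P @ blk(A, k, j)
        blk(B, k, j)[:] = P @ blk(B, k, j)

    def eliminate(i):                        # block rows i != k touch disjoint
        L = blk(A, i, k).copy()              # data, hence are independent tasks
        for j in range(q):
            blk(A, i, j)[:] -= L @ blk(A, k, j)
            blk(B, i, j)[:] -= L @ blk(B, k, j)

    list(pool.map(eliminate, (i for i in range(q) if i != k)))
```

Calling parallel_step for k = 0, ..., q-1 with a shared ThreadPoolExecutor reproduces the sequential result.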

MARS-based adaptability

The MARS system aims mainly at respecting the machines' ownership and at exploiting the wasted time when executing applications in adaptive environments. In such environments, nodes (workstations, processors of parallel machines, etc.) are dynamically removed from and/or added to the virtual machine executing a parallel adaptive application. As a consequence, work must be dynamically interrupted when a node has to leave the virtual machine, and be resumed later as soon as a node joins

Performance evaluation

In this section, we first present the influence of the granularity of parallelism on the efficiency of the parallel execution of the algorithm. Then, we study the scalability of the algorithm. Finally, we evaluate the processing and communication costs involved in the execution of the application in an adaptive environment. The experimentation platform (Fig. 7) is composed of three networks interconnected by an FDDI network: the Research Laboratory (LIFL) Ethernet NOW, the Teaching Division

Conclusions and future work

Solving long-running scientific applications requires exploiting all the available power of meta-systems including MPPs, NOWs and COWs. In order to meet this objective, it is necessary to solve some issues related to the adaptive nature (arrivals and departures of workstations, and load fluctuation) of meta-systems. The major ones are the following: the adaptive control of parallelism, fault tolerance, and the scheduling of work units.

First, the adaptive control of parallelism

References (14)

  • A. Geist, A. Beguelin, J. Dongarra et al., PVM: Parallel Virtual Machine, A User's Guide and Tutorial for Networked...
  • MPI Forum, MPI: A message-passing interface standard, Technical report, April...
  • E.-G. Talbi, J.-M. Geib, Z. Hafidi, D. Kebbal, MARS: An adaptive parallel programming environment, in: High Performance...
  • D. Kebbal, E.-G. Talbi, J.-M. Geib, A new approach for checkpointing parallel adaptive applications, in: PDPTA'97...
  • D. Gelernter, D.L. Kaminsky, Supercomputing out of recycled garbage: preliminary experience with Piranha, in: Sixth...
  • J. Pruyne, M. Livny, Parallel processing on dynamic resources with CARMI, in: Proceedings of the Workshop on Job...
  • C. Reinsch, F. Bauer, Inversion of positive matrices by the Gauss–Jordan method, in: J. Wilkinson, C. Reinsch (Eds.),...
