Metacomputing in gigabit environments: Networks, tools, and applications
Introduction
In recent years, metacomputing has become a catchword in the supercomputing community. Like other catchwords, it is often unclear what exactly it is supposed to mean. Commonly, however, it describes some form of linking together computational resources so that they compete with supercomputers or, at least in theory, outperform them. This development is driven mainly by two ideas.
First, supercomputing resources are expensive and have a short life cycle. For economic reasons they should be shared between different research centers. Such resources include not only supercomputers of different architectures (massively parallel and vector-based), but also high-quality visualization hardware like the CAVE [1] and other devices that produce or consume data at high rates, for example Magnetic Resonance (MR) tomographs. Combining these resources leads to a heterogeneous metacomputer. Typical examples of such a scenario are the coupled simulation of groundwater flow and contaminant transport, and the real-time visualization of brain activity, as described in this paper.
The second idea is that coupling supercomputers offers a way to increase the peak performance of a machine. In principle, two T3Es could be twice as powerful as one. Typical applications for such a homogeneous metacomputing scenario are Monte Carlo codes, as described in this paper.
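Monte Carlo codes suit this scenario because their samples are independent: each machine can evaluate its own partition of the work, and only a tiny reduction result has to cross the slow external link. The following sketch of that pattern is purely illustrative (plain Python, with made-up partition sizes standing in for the nodes of two coupled machines):

```python
import random

def count_hits(n_samples, seed):
    """Count random samples falling inside the unit quarter circle."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits

# Each machine of the metacomputer evaluates its partition independently;
# only the small per-partition counts would cross the external network.
partitions = [(250_000, seed) for seed in range(4)]
total = sum(count_hits(n, s) for n, s in partitions)
pi_estimate = 4.0 * total / sum(n for n, _ in partitions)
```

Because communication is limited to the final sum, the ratio of computation to external traffic stays high no matter how slow the wide-area link is.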
Since the effort required to couple such powerful machines is high and the benefit is limited to very loosely coupled applications, metacomputing as described here is obviously restricted to a limited number of special applications. Those applications, however, can benefit substantially from the accumulated performance of a metacomputing environment.
Metacomputing currently faces a number of problems, some of which are well understood while others still have to be investigated thoroughly. Following a layered approach, the problems are threefold.
First, there is the network problem. Coupling remote resources requires fast and reliable networks. A resource like the internet is not designed to support the traffic characteristics of a metacomputing application; software relying on it may therefore sometimes yield acceptable results and sometimes fail completely. One of the prerequisites for metacomputing is therefore a stable and fast network connection that can be dedicated to a single application run. Typically, such quality of service can be provided by ATM, but so far ATM networks for research activities are not widely available.
Second, there is the communication problem. While each hardware vendor has adopted the MPI standard and provides its users with fast and stable implementations, there is no support for metacomputing. Since even the MPI Forum has refused to put the topic on its agenda, it is up to the user to find ways to overcome the problem. PVM is designed with this in mind, but PVM is no longer the standard in the field: most users have moved to MPI and do not want to change their code for metacomputing experiments. A tool that bridges the gap between PVM and MPI is PVMPI [9], but this too would require the user to change his code substantially. It has therefore become necessary to provide tools that give the user a global MPI – often called an interoperable MPI. One such tool, PACX-MPI, is described in this paper.
Third, there is the application level, at which one has to consider the limitations of metacomputing. Latencies even on fast networks reach several milliseconds: even traveling at the speed of light, a message from Germany to the US takes about 25 ms. And although bandwidths are constantly increasing, it is unlikely that external bandwidths will ever compete with the internal bandwidth of a highly integrated MPP. Applications therefore have to accommodate a substantially higher latency for communication between machines, and in addition they have to deal with bandwidths that vary by orders of magnitude between internal and external communication. One approach to this problem is latency hiding, i.e. overlapping communication with computation. Since it is often a non-trivial task to incorporate such methods in an application, supporting libraries are useful; such a library is described in the load balancing section of this paper.
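The latency-hiding idea can be illustrated in a few lines: post the external transfer first, compute on data that does not depend on it, and only then wait for completion. In the sketch below a thread stands in for a nonblocking MPI transfer, and the 25 ms delay mimics the transatlantic latency mentioned above; it is an illustration of the pattern, not of any particular library:

```python
import threading
import time

def remote_exchange(send_buf, recv_buf):
    """Stand-in for a nonblocking external transfer: on a transatlantic
    link, latency alone costs on the order of 25 ms."""
    time.sleep(0.025)
    recv_buf.extend(reversed(send_buf))  # pretend the partner replied

boundary, received = [1.0, 2.0, 3.0], []
comm = threading.Thread(target=remote_exchange, args=(boundary, received))
comm.start()                                   # post communication first ...
interior = sum(i * i for i in range(200_000))  # ... then compute locally
comm.join()                                    # boundary data needed only now
```

As long as the interior computation takes longer than the wire time, the external latency disappears entirely from the critical path.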
Besides these technical problems there are a number of organizational ones. Running metacomputing applications requires synchronizing the computing resources of several computing centers. Furthermore, input and output data have to be distributed and collected. This is not a real problem as long as metacomputing is performed on an experimental basis, but in a production environment secure and consistent access to distributed computing resources and data will be essential too. Research projects that address these topics include UNICORE [2] and HPCM [3].
In the following, our current activities and results in the above-mentioned aspects of metacomputing – networking, tools, and applications – are discussed.
Networks
A key factor for the success of metacomputing activities is the availability of communication networks that provide high-bandwidth, low-latency connections between the components of the metacomputer. Generally, the performance available over wide area networks is low compared to communication within a parallel computer.
In Germany, the network that connects research, science and educational institutions with each other and the rest of the internet is operated by the DFN-Verein, an association of these
PACX-MPI
PACX-MPI was developed to extend MPI [4] communication beyond the boundary of an MPP system. Typically, such systems offer an optimized version of MPI that does not allow communication outside the system. Only recently have commercial implementations appeared that allow a single MPI application to run across several machines, but then the user is restricted to one hardware vendor [5,6]. Public domain implementations of MPI like MPICH [7] support clusters of machines
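To give an idea of what such an interoperable MPI layer has to do internally, the toy sketch below translates a global rank into a (machine, local rank) pair. The machine names and sizes are invented, and the real PACX-MPI does considerably more, in particular routing external messages through dedicated communication nodes:

```python
# Hypothetical configuration of a two-machine metacomputer.
machines = [("mpp-a", 512), ("mpp-b", 512)]

def global_to_local(global_rank):
    """Translate a global MPI rank into a (machine, local rank) pair."""
    offset = 0
    for name, size in machines:
        if global_rank < offset + size:
            return name, global_rank - offset
        offset += size
    raise ValueError("global rank out of range")
```

With such a mapping, the application sees one global MPI_COMM_WORLD of 1024 ranks, while each send is dispatched either to the vendor MPI (same machine) or to the external network (different machine).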
TRACE/PARTRACE
The program TRACE (Transport of Contaminants in Environmental Systems) simulates the flow of water in variably saturated, porous, heterogeneous media. It is used in combination with the program PARTRACE (PARticle TRACE) for 3-D simulations of particle transport in groundwater [30]. The programs were developed at the Institute for Petroleum and Organic Geochemistry at the Forschungszentrum Jülich. TRACE is based upon 3DFEMWATER, a groundwater simulation code by Yeh [31]. PARTRACE performs
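Random-walk particle tracking of this kind advances each particle by a deterministic advection step plus a random dispersion step. A schematic version of one such step is shown below; all coefficients are invented for illustration, whereas the real code operates on the 3-D finite element velocity field computed by TRACE:

```python
import random

def step(pos, velocity, dispersion, dt, rng):
    """One advection-dispersion step of a random-walk particle tracker."""
    sigma = (2.0 * dispersion * dt) ** 0.5  # std. dev. of the random jump
    return tuple(p + v * dt + rng.gauss(0.0, sigma)
                 for p, v in zip(pos, velocity))

rng = random.Random(42)
pos = (0.0, 0.0, 0.0)
for _ in range(100):
    pos = step(pos, velocity=(1.0, 0.0, 0.0), dispersion=0.01, dt=0.1, rng=rng)
# mean displacement after t = 100 * 0.1 = 10 time units is v * t = 10 in x
```

Since every particle evolves independently, the particle population can be partitioned across machines in the same embarrassingly parallel fashion as the Monte Carlo example above.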
Conclusion
The examples in this contribution show that the various problems we face in metacomputing environments are quite well understood. The evolution of wide area networks is continuously increasing the available bandwidth, but latency is already approaching the limit imposed by the speed of light. Applications therefore have to be selected carefully for metacomputing. To keep the effort of porting applications to a metacomputer acceptable, supporting tools and libraries are essential.
Acknowledgements
The authors gratefully acknowledge support from Pittsburgh Supercomputing Center and supercomputing time provided by the San Diego Supercomputing Center. We also wish to thank the BMBF for funding parts of this work and the DFN for its support.
References
- W. Gropp et al., A high-performance, portable implementation of the MPI message passing interface standard, Parallel Comput. (1996)
- C. Cruz-Neira et al., The CAVE: Audio Visual Experience Automatic Virtual Environment, Comm. ACM (1992)
- D. Erwin, The UNICORE Architecture and Project Plan, Workshop on Seamless Computing, ECMWF, Reading, 16–17 September...
- V. Sander, High Performance Computer Management, Workshop Hypercomputing, Rostock, 8–11 September...
- Message Passing Interface Forum, MPI: A Message-Passing Interface Standard, University of Tennessee,...
- R. Bourbonnais, The Thinking behind SUN's MPI Machines, The Fourth EuroPVM-MPI Users' Group Meeting, Cracow, Poland, 3–5...
- P. Romero, Message passing interface on HP Exemplar systems, in: The Fourth EuroPVM-MPI Users' Group Meeting, Cracow,...
- A. Geist, PVM 3 User's Guide and Reference Manual, ORNL/TM-12187,...
- G.E. Fagg, J.J. Dongarra, PVMPI: An integration of the PVM and MPI systems, Department of Computer Science Technical...
- M. Brune, J. Gehring, A. Reinefeld, A lightweight communication interface for parallel programming environments, in:...