OPIOM: Off-Processor I/O with Myrinet☆
Introduction
The availability of powerful microprocessors and high-speed networks as commodity components is making clusters an appealing solution for cost-effective high performance computing. However, the bottleneck for users’ applications tends to shift from the computation and communication sides to the I/O domain: problem sizes keep growing, and the time to load datasets into the cluster work pool and to write results back to disk can no longer be neglected.
The new generation of commodity components, such as storage controllers and high-speed networks, can be used to break architectural limitations inherited from the past while keeping the price/performance ratio as low as possible. Our work removes one such bottleneck in the data path used by parallel I/O on clusters.
In Section 2, we present the motivation for this work, a current limitation of parallel I/O designs for clusters, and related work that attempts to address it. We propose a new design in Section 3, describing our contribution and detailing the issues encountered during the implementation. Section 4 then presents experimental benchmarks that highlight the benefit of our work for parallel I/O implementations. Finally, Section 5 concludes by summarizing our work and presenting the short- and medium-term perspectives of our project.
Section snippets
Motivation
Today’s clusters are larger and more powerful than ever before. They are beginning to be used for Grand Challenge problems in genomics and nuclear simulation, and for I/O-intensive multimedia applications such as Video-on-Demand (VOD). The datasets involved in these contexts are very large, and they require the I/O sub-system to be as efficient as the computation and communication components.
There are two ways to achieve high-performance I/O in a cluster environment:
- To use a dedicated Storage Area Network
Contribution
The data path between the storage controller and the network interface passes through the host memory, despite the fact that the data is not processed by the main processor before being sent to the I/O request emitter. The data goes through the host because of system constraints: interaction with a local storage controller is traditionally driven by a user application, and the communication interface of the NIC assumes the data to be present in the main memory at the beginning of a
Experimentation
We conducted experiments with OPIOM to validate the implementation and to measure the performance gain over the regular I/O path.
Conclusions and perspectives
Parallel I/O is a very important research domain for high performance cluster computing. Today’s clusters cannot be used to their full capacity because of disappointing I/O performance relative to their computational power. We have designed a basic interface to optimize data movement between disks and an intelligent network interface under Linux. Our implementation with SCSI and Myrinet, OPIOM, provides high throughput and very low host overhead as well as a UNIX-like
Patrick Geoffray received his PhD from the University of Lyon (France) in 2001. He is currently a senior programmer at Myricom, in the branch office located in Oak Ridge, TN, USA. He is in charge of the high performance middleware on top of Myrinet: MPI, VIA, SHMEM. His interests include high speed interconnects, message passing and high performance storage I/O.
Cited by (11)
- Exploring I/O virtualization data paths for MPI applications in a cluster of VMs: A networking perspective (2011, Lecture Notes in Computer Science)
- Communication-aware load balancing for parallel applications on clusters (2010, IEEE Transactions on Computers)
- Dynamic load balancing for I/O-intensive applications on clusters (2009, ACM Transactions on Storage)
- Synchronized send operations for efficient streaming block I/O over Myrinet (2008, Proceedings of the 22nd IEEE International Parallel and Distributed Processing Symposium, IPDPS 2008)
- Efficient block device sharing over Myrinet with memory bypass (2007, Proceedings of the 21st International Parallel and Distributed Processing Symposium, IPDPS 2007)
- Improving the performance of I/O-intensive applications on clusters of workstations (2006, Cluster Computing)
- ☆ This work was realized with the help of Sycomore—Aerospatiale Matra in France, and of Jack Dongarra and Rich Wolski at the University of Tennessee, USA.
- 1 URL: http://www.myri.com.