ABSTRACT
As I/O-intensive MPI programs become increasingly common, many efforts have been made to improve their I/O performance, on both the software and the architecture sides. On the software side, researchers can optimize processes' access patterns, either individually (e.g., by using large and sequential requests in each process) or collectively (e.g., by using collective I/O). On the architecture side, files are striped over multiple I/O nodes for high aggregate I/O throughput. However, a key weakness, the access interference at each I/O node, remains unaddressed by these efforts. When requests from multiple processes are served simultaneously by multiple I/O nodes, each I/O node has to serve requests from different processes concurrently. Usually an I/O node stores its data on hard disks, and different processes access different regions of a data set. When a burst of requests arrives from multiple processes, requests from different processes to a disk compete for its single disk head, and disk efficiency can be significantly reduced by frequent disk-head seeks.
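To make the interference concrete, the minimal sketch below shows the kind of independent per-process access pattern described above; the file name, request size, and request count are illustrative assumptions, not taken from the paper. Each MPI process issues large, sequential reads to its own region of a shared file, yet an I/O node holding stripes touched by many processes must still interleave their requests and seek between distant disk regions.

/* Minimal sketch (file name and sizes are illustrative): each MPI
 * process issues independent reads to its own region of a shared file
 * that the parallel file system stripes across I/O nodes. Each process
 * is sequential from its own point of view, but an I/O node receiving
 * requests from many processes at once must seek between their
 * regions on its disks. */
#include <mpi.h>
#include <stdlib.h>

#define BLOCK (4 * 1024 * 1024)   /* 4 MB per request, illustrative */
#define NREQ  16                  /* requests per process, illustrative */

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_File fh;
    char *buf = malloc(BLOCK);

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* "shared.dat" is a placeholder for a file striped over the I/O nodes. */
    MPI_File_open(MPI_COMM_WORLD, "shared.dat", MPI_MODE_RDONLY,
                  MPI_INFO_NULL, &fh);

    /* Each process reads a large, contiguous region of its own,
     * using independent (non-collective) requests. */
    for (int i = 0; i < NREQ; i++) {
        MPI_Offset off = ((MPI_Offset)rank * NREQ + i) * BLOCK;
        MPI_File_read_at(fh, off, buf, BLOCK, MPI_BYTE, MPI_STATUS_IGNORE);
    }

    MPI_File_close(&fh);
    free(buf);
    MPI_Finalize();
    return 0;
}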
In this paper, we propose a scheme, InterferenceRemoval, to eliminate I/O interference by exploiting the optimized access patterns and the potentially high throughput provided by multiple I/O nodes. It identifies the segments of files that could be involved in interfering accesses and replicates them to their respectively designated I/O nodes. When interference is detected at an I/O node, some I/O requests are redirected to the replicas on other I/O nodes, so that each I/O node serves requests from only one or a small number of processes. InterferenceRemoval has been implemented in the MPI library, for high portability, on top of the Lustre parallel file system. Our experiments with representative benchmarks, such as NPB BTIO and mpi-tile-io, show that it can significantly improve the I/O performance of MPI programs. For example, the I/O throughput of mpi-tile-io is increased by 105% compared to not using collective I/O, and by 23% compared to using collective I/O.
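InterferenceRemoval itself is implemented inside the MPI library; the sketch below is only a conceptual illustration of the redirection idea, not the paper's code. The helpers replica_path_for_rank() and interference_detected(), the replica file naming, and the use of MPI_COMM_SELF are all assumptions made for this example.

/* Conceptual sketch only (not the paper's implementation): when an
 * interference condition is detected, a process's requests can be
 * redirected to a replica of the relevant file segments placed on an
 * I/O node designated for that process. replica_path_for_rank() and
 * interference_detected() are hypothetical helpers for illustration. */
#include <mpi.h>
#include <stdio.h>

/* Hypothetical: a per-rank replica file, pre-created on a designated
 * I/O node (e.g., via file-system striping hints). */
static void replica_path_for_rank(int rank, char *path, size_t len)
{
    snprintf(path, len, "shared.dat.replica.%d", rank);
}

/* Hypothetical stand-in for the scheme's run-time interference
 * detection at the I/O nodes. */
static int interference_detected(void) { return 1; }

int redirected_read(MPI_File primary, int rank, MPI_Offset off,
                    void *buf, int count)
{
    if (interference_detected()) {
        char path[64];
        MPI_File rep;
        int rc;
        replica_path_for_rank(rank, path, sizeof(path));
        /* Read the same logical segment from the replica, so this rank's
         * requests are served by one designated I/O node instead of
         * competing on the nodes that hold the primary copy. */
        MPI_File_open(MPI_COMM_SELF, path, MPI_MODE_RDONLY,
                      MPI_INFO_NULL, &rep);
        rc = MPI_File_read_at(rep, off, buf, count, MPI_BYTE,
                              MPI_STATUS_IGNORE);
        MPI_File_close(&rep);
        return rc;
    }
    return MPI_File_read_at(primary, off, buf, count, MPI_BYTE,
                            MPI_STATUS_IGNORE);
}

In the actual scheme, only the file segments identified as interference-prone are replicated, and redirection is triggered by interference detected at the I/O nodes rather than by an always-true predicate as in this sketch.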
REFERENCES
- M. Bhadkamkar, J. Guerra, L. Useche, S. Burnett, J. Liptak, R. Rangaswami, and V. Hristidis, "BORG: Block-reORGanization for Self-optimizing Storage Systems", In Proceedings of the 7th USENIX Conference on File and Storage Technologies, San Francisco, CA, 2009.
- A. Ching, A. Choudhary, W. Liao, R. Ross, and W. Gropp, "Efficient Structured Data Access in Parallel File System", In Proceedings of the IEEE International Conference on Cluster Computing, Hong Kong, China, 2003.
- A. Ching, A. Choudhary, K. Coloma, and W. Liao, "Noncontiguous I/O Accesses Through MPI-IO", In Proceedings of the IEEE International Symposium on Cluster, Cloud, and Grid Computing, Tokyo, Japan, 2003.
- Cluster File Systems, Inc., "Lustre: A Scalable, Robust, Highly-Available Cluster File System", http://www.lustre.org/. Online document, 2010.
- H. Huang, W. Hung, and K. Shin, "FS2: Dynamic Data Replication in Free Disk Space for Improving Disk Performance and Energy Consumption", In Proceedings of the ACM Symposium on Operating Systems Principles, Brighton, UK, 2005.
- W. Hsu, A. Smith, and H. Young, "The Automatic Improvement of Locality in Storage Systems", ACM Transactions on Computer Systems, Volume 23, Issue 4, Nov. 2006, Pages 424--473.
- W. Hsu, A. Smith, and H. Young, "The Automatic Improvement of Locality in Storage Systems", Technical Report CSD-03-1264, UC Berkeley, Jul. 2003.
- Interleaved or Random (IOR) Benchmarks, http://www.cs.dartmouth.edu/pario/examples.html. Online document, 2008.
- S. Iyer and P. Druschel, "Anticipatory Scheduling: A Disk Scheduling Framework to Overcome Deceptive Idleness in Synchronous I/O", In Proceedings of the ACM Symposium on Operating Systems Principles, Banff, Canada, 2001.
- D. Kotz, "Disk-directed I/O for MIMD Multiprocessors", ACM Transactions on Computer Systems, Volume 15, Issue 1, Feb. 1997, Pages 41--74.
- S. Liang, S. Jiang, and X. Zhang, "STEP: Sequentiality and Thrashing Detection Based Prefetching to Improve Performance of Networked Storage Servers", In Proceedings of the International Conference on Distributed Computing Systems, Toronto, Canada, 2007.
- Mpi-tile-io Benchmark, http://www-unix.mcs.anl.gov/thakur/pio-benchmarks.html. Online document, 2009.
- M. Kandemir, S. Son, and M. Karakoy, "Improving I/O Performance of Applications through Compiler-Directed Code Restructuring", In Proceedings of the 6th USENIX Conference on File and Storage Technologies, San Jose, CA, 2008.
- MPICH2, Argonne National Laboratory, http://www.mcs.anl.gov/research/projects/mpich2/. Online document, 2009.
- NAS Parallel Benchmarks, NASA Ames Research Center, http://www.nas.nasa.gov/Software/NPB/. Online document, 2009.
- PVFS, http://www.pvfs.org. Online document, 2010.
- P. Pacheco, "Parallel Programming with MPI", Morgan Kaufmann Publishers, Pages 137--178, 1997.
- K. Seamons, Y. Chen, P. Jones, J. Jozwiak, and M. Winslett, "Server-directed Collective I/O in Panda", In Proceedings of Supercomputing, San Diego, CA, 1995.
- F. Schmuck and R. Haskin, "GPFS: A Shared-Disk File System for Large Computing Clusters", In Proceedings of the 1st USENIX Conference on File and Storage Technologies, Monterey, CA, 2002.
- R. Thakur, W. Gropp, and E. Lusk, "Data Sieving and Collective I/O in ROMIO", In Proceedings of the 7th Symposium on the Frontiers of Massively Parallel Computation, Annapolis, MD, 1999.
- S3aSim I/O Benchmark, http://www-unix.mcs.anl.gov/thakur/s3asim.html. Online document, 2009.
- The DiskSim Simulation Environment (v4.0), Parallel Data Lab, http://www.pdl.cmu.edu/DiskSim/. Online document, 2009.
- Y. Wang and D. Kaeli, "Profile-Guided I/O Partitioning", In Proceedings of the International Conference on Supercomputing, San Francisco, CA, 2003.
- C. Wang, Z. Zhang, X. Ma, S. Vazhkudai, and F. Mueller, "Improving the Availability of Supercomputer Job Input Data Using Temporal Replication", In Proceedings of the International Supercomputing Conference, Hamburg, Germany, 2009.
- X. Zhang, S. Jiang, and K. Davis, "Making Resonance a Common Case: A High-performance Implementation of Collective I/O on Parallel File Systems", In Proceedings of the IEEE International Parallel & Distributed Processing Symposium, Rome, Italy, 2009.