Orthrus: A Framework for Implementing Efficient Collective I/O in Multi-core Clusters

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8488)

Abstract

Optimization of access patterns using collective I/O imposes the overhead of exchanging data between processes. In a multi-core-based cluster the costs of inter-node and intra-node data communication are vastly different, and this heterogeneity in data-exchange efficiency poses both a challenge and an opportunity for implementing efficient collective I/O. The opportunity is to effectively exploit fast intra-node communication: we propose to improve communication locality for greater data-exchange efficiency. However, such an effort is at odds with improving access locality for I/O efficiency, which can also be critical to collective-I/O performance. To address this issue we propose a framework, Orthrus, that can accommodate multiple collective-I/O implementations, each optimized for particular performance aspects, and that dynamically selects the best-performing one according to the current workload and system conditions. We have implemented Orthrus in the ROMIO library. Our experimental results with representative MPI-IO benchmarks, on both a small dedicated cluster and a large production HPC system, show that Orthrus can significantly improve collective-I/O performance under various workload and system scenarios.
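
The selection mechanism described in the abstract can be pictured as a thin dispatch layer sitting between the application's collective-I/O call and several interchangeable implementations. The C sketch below is only a rough illustration of that idea: the orthrus_select() heuristic, its inputs, and the IMPL_* variants are hypothetical stand-ins, not the actual Orthrus/ROMIO code; only the MPI-IO calls themselves are standard.

```c
/* Illustrative sketch of dynamic selection between two collective-I/O
 * strategies. All orthrus_* names and the heuristic are hypothetical. */
#include <mpi.h>

typedef enum {
    IMPL_ACCESS_LOCALITY,   /* optimized for contiguous file access   */
    IMPL_COMM_LOCALITY      /* optimized for intra-node data exchange */
} impl_t;

/* Toy heuristic (an assumption, not the paper's algorithm): favor the
 * communication-locality implementation when most of the shuffle
 * traffic would stay within a node. */
static impl_t orthrus_select(double intra_node_frac, double contig_frac)
{
    return (intra_node_frac > contig_frac) ? IMPL_COMM_LOCALITY
                                           : IMPL_ACCESS_LOCALITY;
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Placeholder metrics; a real framework would measure these online. */
    impl_t chosen = orthrus_select(0.7, 0.4);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "orthrus_demo.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    int buf = rank;
    MPI_Offset off = (MPI_Offset)rank * sizeof(int);

    if (chosen == IMPL_COMM_LOCALITY) {
        /* ...node-local aggregation of write data would go here... */
    } else {
        /* ...file-domain partitioning for access locality would go here... */
    }

    /* Either strategy ultimately funnels into a standard MPI-IO collective. */
    MPI_File_write_at_all(fh, off, &buf, 1, MPI_INT, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
```

In the framework itself this decision happens inside ROMIO, transparently to the application; the sketch merely shows where a workload-driven choice between implementations would slot in.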

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhang, X., Ou, J., Davis, K., Jiang, S. (2014). Orthrus: A Framework for Implementing Efficient Collective I/O in Multi-core Clusters. In: Kunkel, J.M., Ludwig, T., Meuer, H.W. (eds) Supercomputing. ISC 2014. Lecture Notes in Computer Science, vol 8488. Springer, Cham. https://doi.org/10.1007/978-3-319-07518-1_22

  • DOI: https://doi.org/10.1007/978-3-319-07518-1_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07517-4

  • Online ISBN: 978-3-319-07518-1

  • eBook Packages: Computer Science, Computer Science (R0)
