ABSTRACT
Large-scale parallel applications can face significant I/O performance bottlenecks, making efficient I/O crucial. This work presents a comparative study of several parallel I/O implementations in the Weather Research and Forecasting (WRF) model: the PnetCDF blocking and non-blocking I/O options, netCDF4, HDF5 Log VOL, and ADIOS. Among I/O methods that create files in a canonical data layout, PnetCDF's non-blocking option offers up to a 2x improvement over its blocking option and up to 4.5x over HDF5 via netCDF4, demonstrating the effectiveness of the write request aggregation technique. Among methods that create files in a log layout, the HDF5 Log VOL outperforms ADIOS with a 4x improvement in write performance, although both require non-negligible time to convert the file back to canonical order for post-run analysis. From these results we draw observations that can guide I/O strategies for modern parallel codes.
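The write request aggregation idea mentioned above can be illustrated with a toy model. The sketch below is not PnetCDF code and does not reflect its internals; `Request` and `aggregate` are hypothetical names introduced only for illustration, and the requests are assumed not to overlap. It shows how posting several small writes first and flushing them together can turn many small file accesses into fewer, larger ones.

```python
from typing import List, Tuple

# Hypothetical request type: (file offset in bytes, payload).
Request = Tuple[int, bytes]


def aggregate(pending: List[Request]) -> List[Request]:
    """Merge queued write requests that are contiguous in the file.

    Assumes requests do not overlap; overlapping requests are not handled.
    """
    merged: List[Tuple[int, bytearray]] = []
    for offset, data in sorted(pending, key=lambda r: r[0]):
        if merged and merged[-1][0] + len(merged[-1][1]) == offset:
            # This request starts exactly where the previous one ends.
            merged[-1][1].extend(data)
        else:
            merged.append((offset, bytearray(data)))
    return [(off, bytes(buf)) for off, buf in merged]


# Blocking style: each of these four requests would be written on its own.
pending = [(8, b"cc"), (0, b"aaaa"), (4, b"bbbb"), (20, b"dd")]

# Non-blocking style: post all requests, then flush them in one step.
flushed = aggregate(pending)
print(len(pending), "requests ->", len(flushed), "writes")
```

In this toy input the three adjacent requests collapse into one write, while the request at offset 20 stays separate because a gap precedes it.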
Index Terms
- I/O in WRF: A Case Study in Modern Parallel I/O Techniques