ABSTRACT
Accurate analysis of HPC storage system designs is contingent on the use of I/O workloads that are truly representative of expected use. However, I/O analyses are generally bound to specific workload modeling techniques such as synthetic benchmarks or trace replay mechanisms, despite the fact that no single workload modeling technique is appropriate for all use cases. In this work, we present the design of IOWA, a novel I/O workload abstraction that allows arbitrary workload consumer components to obtain I/O workloads from a range of diverse input sources. Thus, researchers can choose specific I/O workload generators based on the resources they have available and the type of evaluation they wish to perform. As part of this research, we also outline the design of three distinct workload generation methods, based on I/O traces, synthetic I/O kernels, and I/O characterizations. We analyze and contrast each of these workload generation techniques in the context of storage system simulation models as well as production storage system measurements. We found that each generator mechanism offers varying levels of accuracy, flexibility, and breadth of use that should be considered before performing I/O analyses. We also recommend a set of best practices for HPC I/O workload modeling based on challenges that we encountered while performing our evaluation.
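The abstraction described above — arbitrary workload consumers pulling I/O operations from interchangeable generators — can be sketched as a common interface with one implementation per generation method. This is a minimal illustration, not IOWA's actual API: the class names, the `IOOp` event schema, and the counter-sampling scheme in `CharacterizationGenerator` are all hypothetical.

```python
import random
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterator, List, Tuple

# One abstract I/O event; fields are illustrative, not IOWA's real schema.
@dataclass
class IOOp:
    op: str       # "read" or "write"
    offset: int   # byte offset within the file
    size: int     # request size in bytes

class WorkloadGenerator(ABC):
    """Common interface consumed by simulators, replay tools, or analyses."""
    @abstractmethod
    def operations(self) -> Iterator[IOOp]: ...

class TraceReplayGenerator(WorkloadGenerator):
    """Replays a recorded trace verbatim: highest fidelity, least flexible."""
    def __init__(self, trace: List[Tuple[str, int, int]]):
        self.trace = trace
    def operations(self):
        for op, offset, size in self.trace:
            yield IOOp(op, offset, size)

class SyntheticKernelGenerator(WorkloadGenerator):
    """Emits a parameterized pattern, e.g. sequential accesses of fixed size."""
    def __init__(self, op: str, count: int, size: int):
        self.op, self.count, self.size = op, count, size
    def operations(self):
        for i in range(self.count):
            yield IOOp(self.op, i * self.size, self.size)

class CharacterizationGenerator(WorkloadGenerator):
    """Samples operations from summary statistics (e.g. Darshan-style counters)."""
    def __init__(self, count: int, read_fraction: float,
                 sizes: List[int], seed: int = 0):
        self.count, self.read_fraction, self.sizes = count, read_fraction, sizes
        self.rng = random.Random(seed)
    def operations(self):
        offset = 0
        for _ in range(self.count):
            op = "read" if self.rng.random() < self.read_fraction else "write"
            size = self.rng.choice(self.sizes)
            yield IOOp(op, offset, size)
            offset += size

def total_bytes(gen: WorkloadGenerator) -> int:
    """A trivial consumer: any analysis runs against the same interface."""
    return sum(op.size for op in gen.operations())
```

Because every consumer sees only `WorkloadGenerator`, swapping a trace replay for a synthetic kernel or a characterization-driven stream requires no change on the consumer side — which is the point of the abstraction: the choice of generator becomes a function of available resources and the accuracy/flexibility trade-off the evaluation demands.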
Index Terms
- Techniques for modeling large-scale HPC I/O workloads