A Best Practice Analysis of HDF5 and NetCDF-4 Using Lustre

  • Conference paper
High Performance Computing (ISC High Performance 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9137)

Abstract

With the constantly increasing number of cores in high performance computing (HPC) systems, applications produce ever more data that must eventually be stored and accessed in parallel. I/O in HPC applications is performed in a layered manner: scientific applications use standardized high-level libraries and data formats such as HDF5 and NetCDF-4 to store and manipulate data that resides in a parallel file system. In this paper, we present a performance analysis of the parallel interfaces of HDF5 and NetCDF-4 using different test configurations in order to provide best practices for choosing the right I/O configuration. Our evaluation follows a breakdown approach in which we examine the performance penalty of each layer. The tested configurations include (i) disjoint and interleaved access patterns, (ii) aligned and unaligned accesses, (iii) collective and independent I/O, and (iv) contiguous and chunked data layouts. The main observation is that interleaved data access, in a certain configuration, achieves close to the maximum performance. We also find that NetCDF-4 does not provide the ability to align accesses to Lustre object boundaries. To overcome this, we have developed a patch that resolves the issue and improves performance dramatically.
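To make the terms in the abstract concrete, the following minimal sketch computes the file offsets that a disjoint and an interleaved access pattern would produce, and counts how many Lustre objects a single access touches depending on its alignment. It is plain offset arithmetic and calls neither HDF5 nor NetCDF-4. The function names, the 1 MiB stripe size, and the assumption that consecutive stripe-sized ranges of the file map to consecutive objects are illustrative choices, not values or code taken from the paper.

```python
# Illustrative only: names and the 1 MiB stripe size are assumptions.
STRIPE_SIZE = 1024 * 1024  # hypothetical Lustre stripe (object chunk) size in bytes


def disjoint_offsets(rank, nprocs, block_size, blocks_per_proc):
    """Disjoint pattern: each process owns one contiguous region of the file."""
    start = rank * blocks_per_proc * block_size
    return [start + i * block_size for i in range(blocks_per_proc)]


def interleaved_offsets(rank, nprocs, block_size, blocks_per_proc):
    """Interleaved pattern: processes take turns, block by block."""
    return [(i * nprocs + rank) * block_size for i in range(blocks_per_proc)]


def align_up(offset, boundary):
    """Round an offset up to the next multiple of boundary."""
    return -(-offset // boundary) * boundary


def stripes_touched(offset, length, stripe_size=STRIPE_SIZE):
    """Number of stripe-sized ranges that the access [offset, offset + length) spans."""
    first = offset // stripe_size
    last = (offset + length - 1) // stripe_size
    return last - first + 1


if __name__ == "__main__":
    for rank in range(4):
        print(rank,
              disjoint_offsets(rank, 4, STRIPE_SIZE, 2),
              interleaved_offsets(rank, 4, STRIPE_SIZE, 2))
    # An aligned 1 MiB access stays inside one stripe; shifting it by
    # 512 bytes makes it straddle two.
    print(stripes_touched(0, STRIPE_SIZE), stripes_touched(512, STRIPE_SIZE))
```

An access that starts on a stripe boundary and is one stripe long touches a single object, while the same access shifted by a few hundred bytes straddles two. This is the kind of overhead that aligning accesses to Lustre object boundaries is meant to avoid. In HDF5, alignment of this sort is usually requested through the file access property list (`H5Pset_alignment`); whether that matches the configuration used in the paper is not stated in the abstract.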


Author information

Corresponding author

Correspondence to Christopher Bartz.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Bartz, C., Chasapis, K., Kuhn, M., Nerge, P., Ludwig, T. (2015). A Best Practice Analysis of HDF5 and NetCDF-4 Using Lustre. In: Kunkel, J., Ludwig, T. (eds) High Performance Computing. ISC High Performance 2015. Lecture Notes in Computer Science, vol 9137. Springer, Cham. https://doi.org/10.1007/978-3-319-20119-1_20

  • DOI: https://doi.org/10.1007/978-3-319-20119-1_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-20118-4

  • Online ISBN: 978-3-319-20119-1

  • eBook Packages: Computer Science (R0)
