Transparent Log-Based Data Storage in MPI-IO Applications

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 4757)

Abstract

The MPI-IO interface is a critical component in I/O software stacks for high-performance computing, and many successful optimizations have been incorporated into implementations to help provide high-performance I/O for a variety of access patterns. However, in spite of these optimizations, there is still a large performance gap between “easy” access patterns and more difficult ones, particularly when applications are unable to describe I/O using collective calls.

In this paper, we present LogFS, a component that implements log-based storage for applications using the MPI-IO interface. We first discuss how this approach allows us to exploit the temporal freedom present in the MPI-IO consistency semantics, allowing optimization of a variety of access patterns that are not well served by existing approaches. We then describe how this component is integrated into the ROMIO MPI-IO implementation as a stackable layer, allowing LogFS to be used on any file system supported by ROMIO. Finally, we show performance results comparing the LogFS approach to current practice using a variety of benchmarks.
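The general idea of log-based storage can be illustrated with a minimal sketch. This is not the LogFS implementation, which the abstract does not detail; the `WriteLog` class and its `write_at` and `replay` methods are hypothetical names chosen for illustration, and an in-memory buffer stands in for the real file. The sketch assumes that writes only need to reach their final file positions at a later synchronization point, so each write can first be appended sequentially to a log, whatever its target offset.

```python
import io


class WriteLog:
    """Hypothetical append-only write log (illustration only, not LogFS)."""

    def __init__(self):
        # Each record is (target_offset, data), kept in arrival order.
        self.records = []

    def write_at(self, offset, data):
        # Appending to the log is sequential regardless of the target
        # offset, so scattered writes become one contiguous stream.
        self.records.append((offset, bytes(data)))

    def replay(self, target):
        # Apply records in arrival order so that, where writes overlap,
        # the most recent write determines the final contents.
        for offset, data in self.records:
            target.seek(offset)
            target.write(data)
        self.records.clear()


def demo():
    target = io.BytesIO()          # stands in for the real file
    log = WriteLog()
    log.write_at(8, b"CCCC")       # out-of-order, non-contiguous writes
    log.write_at(0, b"AAAA")
    log.write_at(4, b"BBBB")
    log.write_at(2, b"xx")         # overlaps two earlier writes
    log.replay(target)             # deferred placement, e.g. at sync or close
    return target.getvalue()
```

Calling `demo()` returns the bytes the file would hold after replay. In a real MPI-IO setting, the replay step would have to happen before the data must become visible to other processes, and reads issued before replay would need to consult the log; both concerns are outside this sketch.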

Editor information

Franck Cappello, Thomas Herault, Jack Dongarra

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kimpe, D., Ross, R., Vandewalle, S., Poedts, S. (2007). Transparent Log-Based Data Storage in MPI-IO Applications. In: Cappello, F., Herault, T., Dongarra, J. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2007. Lecture Notes in Computer Science, vol 4757. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75416-9_34

  • DOI: https://doi.org/10.1007/978-3-540-75416-9_34

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-75415-2

  • Online ISBN: 978-3-540-75416-9

  • eBook Packages: Computer Science, Computer Science (R0)
