Performance model-directed data sieving for high-performance I/O

The Journal of Supercomputing

Abstract

Many scientific computing applications and engineering simulations exhibit noncontiguous I/O access patterns. Data sieving is an important technique for improving the performance of noncontiguous I/O accesses: it combines small, noncontiguous requests into a single large, contiguous request. It has proven effective even though it may access more data than was requested. In this study, we propose a new data sieving approach, performance model-directed data sieving, or PMD data sieving for short. It improves on the existing data sieving approach in two respects: (1) it dynamically determines when data sieving is beneficial; and (2) it dynamically determines how to perform data sieving when it is beneficial. Both theoretical analysis and experimental results verify that it considerably improves on the performance of the existing approach and reduces memory consumption. Given the importance of supporting noncontiguous accesses effectively and of reducing memory pressure in large-scale systems, the proposed PMD data sieving approach holds great promise for high-performance I/O systems.
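The two decisions described in the abstract (when to sieve and how to sieve) can be illustrated with a minimal Python sketch. This is not the performance model from the article, which is not shown on this page. It assumes a toy cost model in which each contiguous access costs a fixed latency plus a transfer time proportional to its size. The names `io_cost`, `plan_sieving`, and `total_cost`, and the parameters `latency_s` and `bandwidth_bps`, are hypothetical and introduced only for illustration.

```python
from typing import List, Tuple

Request = Tuple[int, int]  # (offset, length) in bytes


def io_cost(nbytes: int, latency_s: float, bandwidth_bps: float) -> float:
    """Toy cost of one contiguous access: fixed latency plus transfer time.

    This linear model is an assumption of the sketch, not the article's model.
    """
    return latency_s + nbytes / bandwidth_bps


def plan_sieving(requests: List[Request], latency_s: float,
                 bandwidth_bps: float) -> List[Request]:
    """Greedily merge neighbouring requests when reading the hole between
    them is cheaper than paying the latency of a separate access."""
    if not requests:
        return []
    ordered = sorted(requests)
    merged = [ordered[0]]
    for offset, length in ordered[1:]:
        prev_off, prev_len = merged[-1]
        prev_end = prev_off + prev_len
        hole = max(0, offset - prev_end)
        # Merging adds the hole's transfer time; not merging adds one latency.
        if hole / bandwidth_bps < latency_s:
            end = max(prev_end, offset + length)
            merged[-1] = (prev_off, end - prev_off)
        else:
            merged.append((offset, length))
    return merged


def total_cost(requests: List[Request], latency_s: float,
               bandwidth_bps: float) -> float:
    """Sum of the toy cost over a list of contiguous accesses."""
    return sum(io_cost(n, latency_s, bandwidth_bps) for _, n in requests)


if __name__ == "__main__":
    reqs = [(0, 4096), (8192, 4096), (10_000_000, 4096)]
    plan = plan_sieving(reqs, latency_s=0.005, bandwidth_bps=100e6)
    print(plan)
    print(total_cost(reqs, 0.005, 100e6), total_cost(plan, 0.005, 100e6))
```

Under this toy model, two nearby requests are combined into one sieved access because the hole between them is cheap to read, while a distant request is issued separately, so less unwanted data is read into the sieving buffer than with a single access covering the whole extent.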

Acknowledgments

We thank the Scalable Computing Laboratory of Illinois Institute of Technology for providing the experimental test platform used to carry out the tests presented in this study.

Author information

Corresponding author

Correspondence to Yong Chen.

About this article

Cite this article

Chen, Y., Lu, Y., Amritkar, P. et al. Performance model-directed data sieving for high-performance I/O. J Supercomput 71, 2066–2090 (2015). https://doi.org/10.1007/s11227-014-1277-8

  • DOI: https://doi.org/10.1007/s11227-014-1277-8
