
Accelerating I/O performance of ZFS-based Lustre file system in HPC environment

The Journal of Supercomputing

Abstract

To meet the increasing data access performance demands of applications running on high-performance computing (HPC) systems, efficient design of the HPC storage file system is becoming more important. Lustre, currently the most widely used distributed parallel file system, is deployed on many of the world's fastest supercomputers and supports two backend file systems: ldiskfs and ZFS. Although ZFS provides numerous performance optimization options, most supercomputer systems use Lustre with ldiskfs for higher performance. In this work, we analyze the root cause of low I/O performance on a ZFS-based Lustre file system and propose a novel ZFS scheme, dynamic-ZFS, which combines two optimization approaches. The experimental results show that our approach improves sequential I/O performance by 37% on average. We demonstrate that dynamic-ZFS delivers I/O performance comparable to that of ldiskfs-based Lustre while still providing a multitude of beneficial features.
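The dynamic-ZFS implementation itself is not reproduced on this page, so the sketch below is only a hedged illustration of the kind of runtime tuning ZFS exposes: it reads and adjusts a dataset's `recordsize` property through the standard `zfs get`/`zfs set` commands. The pool/dataset name `ostpool/ost0` and the tuning policy are hypothetical placeholders, not the authors' scheme.

```python
#!/usr/bin/env python3
"""Hypothetical sketch of runtime ZFS tuning on a Lustre OST dataset.

This is NOT the paper's dynamic-ZFS implementation; it only shows how a
ZFS dataset property can be queried and changed at runtime using the
standard `zfs` CLI. Run as root on a host with an OpenZFS pool.
"""
import subprocess

DATASET = "ostpool/ost0"  # hypothetical pool/dataset name


def get_recordsize(dataset: str) -> str:
    """Return the dataset's current recordsize (e.g. '128K')."""
    out = subprocess.run(
        ["zfs", "get", "-H", "-o", "value", "recordsize", dataset],
        capture_output=True, text=True, check=True)
    return out.stdout.strip()


def set_recordsize(dataset: str, size: str) -> None:
    """Set recordsize; it applies only to blocks written afterward.

    recordsize must be a power of two from 512 bytes up to 1M
    (larger with the large_blocks pool feature enabled).
    """
    subprocess.run(["zfs", "set", f"recordsize={size}", dataset],
                   check=True)


if __name__ == "__main__":
    print("current recordsize:", get_recordsize(DATASET))
    # For large sequential I/O, a larger recordsize typically reduces
    # per-block overhead; small random I/O tends to favor smaller records.
    set_recordsize(DATASET, "1M")
    print("new recordsize:", get_recordsize(DATASET))
```

On Linux, coarser module-level knobs are also exposed under /sys/module/zfs/parameters/ and documented in the OpenZFS module-parameter guide; which knobs dynamic-ZFS actually manipulates is described in the full text, not here.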




Availability of data and materials

The results, data, and figures in this manuscript have not been published elsewhere, nor are they under consideration by another publisher.


Acknowledgements

Not applicable.

Funding

This research was supported by the Korea Institute of Science and Technology Information (K-22-L02-C06-S01, K-22-L02-C01); the Basic Science Research Program (NRF-2020R1F1A1072696, NRF-2021R1F1A1063438) through the National Research Foundation of Korea (NRF), funded by the Ministry of Science and ICT; BK21 FOUR Intelligence Computing (Dept. of Computer Science and Engineering, SNU), funded by the National Research Foundation of Korea (NRF) (4199990214639); the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2022-2018-0-01423), supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation); and the GRRC program of Gyeonggi province (No. GRRC-KAU-2017-B01, "Study on the Video and Space Convergence Platform for 360VR Services"). This work was also supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2022R1G1A1011433).

Author information


Contributions

JB wrote the main manuscript text. E-KB and HS prepared the evaluation setup, and CK helped run the experiments with JB. JL and HE provided advice throughout the work. All authors reviewed the manuscript.

Corresponding author

Correspondence to Jaehwan Lee.

Ethics declarations

Ethics approval and consent to participate

I confirm that I understand that The Journal of Supercomputing is a transformative journal. When research is accepted for publication, there is a choice to publish using either immediate gold open access or the traditional publishing route.

Consent for publication

I have read the Springer journal policies on author responsibilities and submit this manuscript in accordance with those policies.

Conflict of interest

I declare that the authors have no competing interests as defined by Springer, or other interests that might be perceived to influence the results and/or discussion reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Bang, J., Kim, C., Byun, EK. et al. Accelerating I/O performance of ZFS-based Lustre file system in HPC environment. J Supercomput 79, 7665–7691 (2023). https://doi.org/10.1007/s11227-022-04966-7

