CLMS: Configurable and Lightweight Metadata Service for Parallel File Systems on NVMe SSDs

  • Conference paper
Advanced Parallel Processing Technologies (APPT 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14103))

Abstract

As large-scale data-intensive applications increasingly run on High-Performance Computing (HPC) systems, the I/O workloads of HPC storage systems are becoming more complex; one example is the growing number of metadata-intensive I/O operations in Exascale computing and High-Performance Data Analytics (HPDA). To meet the rising performance requirements of the metadata service in HPC parallel file systems, this paper proposes a Configurable and Lightweight Metadata Service (CLMS) design for parallel file systems on NVMe SSDs. CLMS introduces a configurable metadata distribution policy that supports both directory-based and hash-based metadata distribution strategies, which can be activated according to the application's I/O access pattern, thus improving the efficiency of metadata accesses from different kinds of data-intensive applications. CLMS further reduces the memory copy and serialization overhead in the I/O path through its full-user metadata service design. We implemented a CLMS prototype and evaluated it with the MDTest benchmark. Our experimental results demonstrate that CLMS can significantly improve the performance of metadata services. In addition, CLMS shows a linear growth trend in performance as the number of metadata servers increases for the unique-directory file distribution pattern.
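To make the two distribution strategies named in the abstract concrete, the following is a minimal sketch of how a client might map a file path to a metadata server under each policy. It is an illustration under stated assumptions, not the CLMS implementation: the names `Policy`, `stable_hash`, and `pick_server`, the use of SHA-256, and the modulo placement are all hypothetical choices made for this example.

```python
import hashlib
import posixpath
from enum import Enum


class Policy(Enum):
    """Hypothetical policy switch; the names are assumptions, not CLMS identifiers."""
    DIRECTORY = "directory"  # co-locate all entries of one directory on one server
    HASH = "hash"            # spread entries across servers by hashing the full path


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash. Python's built-in hash() is salted per process,
    so it would not give the same placement on every client."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pick_server(path: str, policy: Policy, num_servers: int) -> int:
    """Return the index of the metadata server responsible for `path`."""
    if num_servers <= 0:
        raise ValueError("num_servers must be positive")
    norm = posixpath.normpath(path)
    if policy is Policy.DIRECTORY:
        # Directory-based: the placement key is the parent directory, so
        # sibling files always land on the same server.
        key = posixpath.dirname(norm)
    else:
        # Hash-based: the placement key is the full path, so files in one
        # large directory are scattered over the servers.
        key = norm
    return stable_hash(key) % num_servers


if __name__ == "__main__":
    for name in ("a.dat", "b.dat", "c.dat"):
        path = f"/job42/out/{name}"
        print(path,
              pick_server(path, Policy.DIRECTORY, 4),
              pick_server(path, Policy.HASH, 4))
```

In this sketch, the directory-based policy keeps sibling files together, which tends to favour workloads where each process works in its own directory, while the hash-based policy spreads the entries of a single shared directory over all servers. How CLMS itself selects and activates a policy is described in the paper, not here.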

Acknowledgements

This work was partially supported by the Foundation of National Key Research and Development Program of China under Grant 2021YFB0300101, the Foundation of State Key Lab of High-Performance Computing under Grant 202101-09, and the Natural Science Foundation of NUDT under Grant ZK21-03.

Author information

Corresponding author

Correspondence to Xuchao Xie.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Li, Q., Lv, S., Xie, X., Song, Z. (2024). CLMS: Configurable and Lightweight Metadata Service for Parallel File Systems on NVMe SSDs. In: Li, C., Li, Z., Shen, L., Wu, F., Gong, X. (eds) Advanced Parallel Processing Technologies. APPT 2023. Lecture Notes in Computer Science, vol 14103. Springer, Singapore. https://doi.org/10.1007/978-981-99-7872-4_6

  • DOI: https://doi.org/10.1007/978-981-99-7872-4_6

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-7871-7

  • Online ISBN: 978-981-99-7872-4

  • eBook Packages: Computer Science, Computer Science (R0)
