Workload Evaluation Tool for Metadata Distribution Method

  • Conference paper
Simulation Tools and Techniques (SIMUtools 2020)

Abstract

In High Performance Computing (HPC), the metadata server cluster is critical to storage system performance, and with the growth of object storage, systems must now distribute metadata across several servers. Storage systems perform better when the workload remains balanced over time: an unbalanced distribution can send frequent requests to a subset of servers while the other servers sit completely idle. To avoid this issue, several metadata distribution methods exist, each with its own best use cases. Moreover, each system has different usages and workloads, so a distribution method that suits one kind of storage system may not suit another. To this end, we propose a tool to evaluate metadata distribution methods under different workloads. In this paper, we describe this tool and use it to compare state-of-the-art methods with one method we developed. We also show how the outputs generated by our tool let us identify the weaknesses of a distribution and choose the best-suited method.
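
To make the notion of imbalance concrete, the following minimal sketch (not the tool described in the paper) places metadata keys on servers with a simple static hash and compares two workloads. The function names, the MD5-based placement, the max-to-mean imbalance ratio, and the two synthetic workloads are all illustrative assumptions, not details taken from the paper.

```python
import hashlib
from collections import Counter


def hash_placement(key: str, n_servers: int) -> int:
    """Assumed static hash placement: map a metadata key to a server index."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_servers


def server_loads(requests, n_servers, placement=hash_placement):
    """Count how many requests each server receives under a placement function."""
    counts = Counter(placement(key, n_servers) for key in requests)
    return [counts.get(server, 0) for server in range(n_servers)]


def imbalance(loads):
    """Busiest server's load divided by the mean load (1.0 means perfectly balanced)."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean if mean else 0.0


# Two hypothetical workloads with the same total number of requests.
uniform = [f"obj-{i}" for i in range(10_000)]                      # every key requested once
skewed = [f"obj-{i}" for i in range(1_000)] + ["obj-hot"] * 9_000  # one very popular key

for name, workload in (("uniform", uniform), ("skewed", skewed)):
    loads = server_loads(workload, n_servers=4)
    print(name, loads, round(imbalance(loads), 2))
```

Under the uniform workload the ratio stays close to 1, whereas the skewed workload sends every request for the hot key to a single server. This is the kind of workload-dependent weakness that an evaluation of distribution methods is meant to expose.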

Copyright information

© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Billa, É., Deniel, P., Zertal, S. (2021). Workload Evaluation Tool for Metadata Distribution Method. In: Song, H., Jiang, D. (eds) Simulation Tools and Techniques. SIMUtools 2020. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 369. Springer, Cham. https://doi.org/10.1007/978-3-030-72792-5_63

  • DOI: https://doi.org/10.1007/978-3-030-72792-5_63

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-72791-8

  • Online ISBN: 978-3-030-72792-5

  • eBook Packages: Computer Science (R0)
