Abstract
In High Performance Computing (HPC), the metadata server cluster is critical to storage system performance, and with the growth of object storage, systems must now distribute metadata across several servers. Storage systems perform better when the workload remains balanced over time: an unbalanced distribution can send frequent requests to a subset of servers while the other servers sit completely idle. Several metadata distribution methods exist to avoid this issue, and each has its best use cases. Moreover, each system has its own usage patterns and workloads, so a distribution method that suits one kind of storage system may not suit another. We therefore propose a tool to evaluate metadata distribution methods under different workloads. In this paper, we describe this tool and use it to compare state-of-the-art methods with one method we developed. We also show how the outputs generated by our tool let us identify the weaknesses of a distribution and choose the most suitable method.
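As a rough illustration of the imbalance problem described in the abstract, the sketch below places object keys on metadata servers with a static hash and compares the per-server load of a uniform workload against one with a single hot object. It is not the tool presented in the paper: the function names, the MD5-based placement, the four-server setup, the synthetic workloads, and the max-over-mean imbalance metric are all assumptions chosen for the example.

```python
import hashlib
from collections import Counter


def server_for(key: str, n_servers: int) -> int:
    """Static hash placement (assumed for illustration): key -> server index."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_servers


def load_per_server(requests, n_servers):
    """Count how many metadata requests each server receives."""
    counts = Counter(server_for(key, n_servers) for key in requests)
    return [counts.get(server, 0) for server in range(n_servers)]


def imbalance(loads):
    """Ratio of the busiest server's load to the mean load (1.0 = balanced)."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean if mean else 0.0


N_SERVERS = 4  # hypothetical cluster size

# Synthetic workloads: every object requested once, then one hot object added.
uniform = [f"obj-{i}" for i in range(1000)]
skewed = uniform + ["obj-0"] * 1000

uniform_loads = load_per_server(uniform, N_SERVERS)
skewed_loads = load_per_server(skewed, N_SERVERS)

print("uniform:", uniform_loads, "imbalance =", round(imbalance(uniform_loads), 2))
print("skewed: ", skewed_loads, "imbalance =", round(imbalance(skewed_loads), 2))
```

Under these assumptions, the hot object pins at least half of the skewed workload to a single server, which is the kind of weakness that a static distribution cannot correct and that a workload-driven evaluation is meant to expose.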
Copyright information
© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Billa, É., Deniel, P., Zertal, S. (2021). Workload Evaluation Tool for Metadata Distribution Method. In: Song, H., Jiang, D. (eds) Simulation Tools and Techniques. SIMUtools 2020. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 369. Springer, Cham. https://doi.org/10.1007/978-3-030-72792-5_63
DOI: https://doi.org/10.1007/978-3-030-72792-5_63
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-72791-8
Online ISBN: 978-3-030-72792-5
eBook Packages: Computer Science, Computer Science (R0)