
MRFS: A Distributed Files System with Geo-replicated Metadata

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2014)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8631)

Abstract

The distributed file system is one of the key building blocks of a data center. As geo-replicated storage systems spread across data centers, both system scale and user population keep growing. Under these conditions, a single metadata server that ignores locality can become a capacity bottleneck and a source of high latency. In this paper, we present the design and implementation of MRFS (Metadata Replication File System), a distributed file system with hierarchical and efficient distributed metadata management, which introduces multiple metadata servers (MDSs) and an additional namespace server (NS). Metadata is divided into non-overlapping parts, and each part is stored on the MDS where the creation operation was issued, while namespace and directory information is maintained in the NS. This hierarchical design not only achieves high scalability but also provides low latency, because the local MDS satisfies the majority of requests. To address hotspots and flash crowds, the system supports flexible, configurable metadata replication among MDSs. Evaluation results show that MRFS is effective and efficient, and that the replication mechanism substantially increases the share of locally served requests at an affordable memory overhead under various scenarios.
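The lookup path described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names, the `replicate_after` threshold, and the policy of copying metadata after repeated remote reads are all assumptions made for illustration, since the abstract says only that replication is flexible and configurable.

```python
class NamespaceServer:
    """Hypothetical NS: records which MDS owns the metadata of each path."""

    def __init__(self):
        self.owner = {}  # path -> id of the MDS where the file was created

    def register(self, path, mds_id):
        self.owner[path] = mds_id

    def locate(self, path):
        return self.owner[path]


class MetadataServer:
    """Hypothetical MDS: owns metadata created through it and may hold replicas."""

    def __init__(self, mds_id, ns, replicate_after=2):
        self.mds_id = mds_id
        self.ns = ns
        self.peers = {}         # mds_id -> MetadataServer
        self.local = {}         # path -> metadata owned by this MDS
        self.replicas = {}      # path -> metadata copied from a peer
        self.remote_hits = {}   # path -> number of remote reads so far
        self.replicate_after = replicate_after  # assumed, configurable threshold

    def create(self, path, meta):
        # Metadata stays on the MDS where the creation was issued;
        # only the ownership record goes to the NS.
        self.local[path] = meta
        self.ns.register(path, self.mds_id)

    def lookup(self, path):
        """Return (metadata, source); source is 'local', 'replica' or 'remote'."""
        if path in self.local:
            return self.local[path], "local"
        if path in self.replicas:
            return self.replicas[path], "replica"
        # Miss: ask the NS for the owner, then read from that MDS.
        owner = self.peers[self.ns.locate(path)]
        meta = owner.local[path]
        self.remote_hits[path] = self.remote_hits.get(path, 0) + 1
        if self.remote_hits[path] >= self.replicate_after:
            # Assumed policy: keep a copy once a path proves hot from here.
            self.replicas[path] = dict(meta)
        return meta, "remote"


def demo():
    ns = NamespaceServer()
    a = MetadataServer("dc-a", ns)
    b = MetadataServer("dc-b", ns)
    a.peers["dc-b"] = b
    b.peers["dc-a"] = a
    a.create("/logs/app.log", {"size": 1024})
    return [
        a.lookup("/logs/app.log")[1],
        b.lookup("/logs/app.log")[1],
        b.lookup("/logs/app.log")[1],
        b.lookup("/logs/app.log")[1],
    ]


if __name__ == "__main__":
    print(demo())
```

In this sketch, the creating data center always reads locally, while a remote data center pays the cross-site cost only until the threshold is reached and reads from its replica afterwards. The sketch deliberately leaves out how replicas are kept consistent after updates, which the abstract does not describe.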




Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Yu, J., Wu, W., Yang, D., Huang, N. (2014). MRFS: A Distributed Files System with Geo-replicated Metadata. In: Sun, Xh., et al. Algorithms and Architectures for Parallel Processing. ICA3PP 2014. Lecture Notes in Computer Science, vol 8631. Springer, Cham. https://doi.org/10.1007/978-3-319-11194-0_21


  • DOI: https://doi.org/10.1007/978-3-319-11194-0_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11193-3

  • Online ISBN: 978-3-319-11194-0

  • eBook Packages: Computer Science, Computer Science (R0)
