A Metadata Cooperative Caching Architecture Based on SSD and DRAM for File Systems

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2015)

Abstract

Metadata IO plays a critical role in achieving high IO scalability and throughput in file systems, yet resource contention keeps its performance low. Adding SSDs to the storage system is an effective way to improve file system performance, but current methods focus mainly on the data server and rarely target metadata IO. In this paper, we propose ACSH, a novel cooperative cache management algorithm based on DRAM and SSD. By exploiting the temporal locality exhibited by most metadata workloads, ACSH improves metadata IO performance while reducing write traffic to the SSD. It also includes an adaptive adjustment model that adjusts the amount of cached metadata according to the locality strength of the metadata workload, which further improves performance and further reduces write traffic to the SSD cache layer. ACSH has been evaluated on real-world workloads. Our experiments show that ACSH reduces latency by up to 1.5–3X compared with a DRAM-only cache of the same cost. Compared with the recent study LARC, it reduces write traffic to the SSD by up to 23–30%.
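To make the general idea concrete, here is a minimal sketch of a two-tier metadata cache in which an entry is written to the SSD tier only if it was re-referenced while resident in DRAM. This is not the ACSH algorithm, whose details the abstract does not give; the class name `TwoTierMetadataCache`, the re-reference threshold of two accesses, and the LRU ordering in both tiers are all assumptions chosen for illustration.

```python
from collections import OrderedDict


class TwoTierMetadataCache:
    """Toy DRAM + SSD cache. Hypothetical sketch, NOT the paper's ACSH."""

    def __init__(self, dram_capacity, ssd_capacity):
        self.dram_capacity = dram_capacity
        self.ssd_capacity = ssd_capacity
        self.dram = OrderedDict()  # key -> access count while resident in DRAM
        self.ssd = OrderedDict()   # key -> None, ordered from least to most recent
        self.ssd_writes = 0        # number of entries written to the SSD tier

    def access(self, key):
        """Return the tier that served the request: 'dram', 'ssd' or 'miss'."""
        if key in self.dram:
            self.dram[key] += 1
            self.dram.move_to_end(key)
            return "dram"
        if key in self.ssd:
            self.ssd.move_to_end(key)
            return "ssd"
        self._insert_dram(key)
        return "miss"

    def _insert_dram(self, key):
        self.dram[key] = 1
        if len(self.dram) > self.dram_capacity:
            victim, hits = self.dram.popitem(last=False)  # evict the LRU entry
            # Assumed admission rule: only entries that showed temporal
            # locality (re-referenced in DRAM) are worth an SSD write.
            if hits >= 2:
                self._write_ssd(victim)

    def _write_ssd(self, key):
        self.ssd[key] = None
        self.ssd_writes += 1
        if len(self.ssd) > self.ssd_capacity:
            self.ssd.popitem(last=False)


cache = TwoTierMetadataCache(dram_capacity=2, ssd_capacity=4)
tiers = [cache.access(k) for k in ["a", "a", "b", "c", "d", "a"]]
print(tiers)             # ['miss', 'dram', 'miss', 'miss', 'miss', 'ssd']
print(cache.ssd_writes)  # 1
```

In this trace only `a` is re-referenced before eviction, so it is the only entry that costs an SSD write; `b` is accessed once and dropped on eviction. Filtering admissions this way trades some SSD hit ratio for lower write traffic, which is the kind of trade-off the abstract describes.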

References

  1. Abad, C.L., Roberts, N., Lu, Y., Campbell, R.H.: A storage-centric analysis of MapReduce workloads: file popularity, temporal locality and arrival patterns. In: IEEE International Symposium on Workload Characterization (IISWC), pp. 100–109. IEEE (2012)

  2. Adams, I.F., Madden, B.A., Frank, J.C., Storer, M.W., Miller, E.L., Harano, G.: Usage behavior of a large-scale scientific archive. In: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, p. 86. IEEE Computer Society Press (2012)

  3. OLTP Application I/O. UMass Trace Repository

  4. Bent, J., Faibish, S., Ahrens, J., Grider, G., Patchett, J., Tzelnic, P., Woodring, J.: Jitter-free co-processing on a prototype exascale storage stack. In: IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST 2012), pp. 1–5. IEEE (2012)

  5. Bent, J., Grider, G., Kettering, B., Manzanares, A., McClelland, M., Torres, A., Torrez, A.: Storage challenges at Los Alamos National Lab. In: IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST 2012), pp. 1–5. IEEE (2012)

  6. Breslau, L., Cao, P., Fan, L., Phillips, G., Shenker, S.: Web caching and Zipf-like distributions: evidence and implications. In: Proceedings of Eighteenth Annual Joint Conference of the IEEE Computer and Communications Societies, INFOCOM 1999, vol. 1, pp. 126–134. IEEE (1999)

  7. Byan, S., Lentini, J., Madan, A., Pabón, L., Condict, M., Kimmel, J., Kleiman, S., Small, C., Storer, M.: Mercury: host-side flash caching for the data center. In: IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST 2012), pp. 1–12. IEEE (2012)

  8. Carns, P., Harms, K., Allcock, W., Bacon, C., Lang, S., Latham, R., Ross, R.: Storage access characteristics of computational science applications. In: Proceedings of 27th IEEE Conference on Mass Storage Systems and Technologies (MSST) (2011)

  9. Chen, F., Koufaty, D.A., Zhang, X.: Hystor: making the best use of solid state drives in high performance storage systems. In: Proceedings of the International Conference on Supercomputing, pp. 22–32. ACM (2011)

  10. Devulapalli, A., Wyckoff, P.: File creation strategies in a distributed metadata file system. In: IEEE International Conference on Parallel and Distributed Processing Symposium, IPDPS 2007, pp. 1–10. IEEE (2007)

  11. Gu, P., Wang, J., Zhu, Y., Jiang, H., Shang, P.: A novel weighted-graph-based grouping algorithm for metadata prefetching. IEEE Trans. Comput. 59(1), 1–15 (2010)

  12. Guerra, J., Pucha, H., Glider, J.S., Belluomini, W., Rangaswami, R.: Cost effective storage using extent based dynamic tiering. In: FAST, pp. 273–286 (2011)

  13. Huang, S., Wei, Q., Chen, J., Chen, C., Feng, D.: Improving flash-based disk cache with lazy adaptive replacement. In: 29th Symposium on Mass Storage Systems and Technologies (MSST 2013), pp. 1–10. IEEE (2013)

  14. Kavalanekar, S., Worthington, B., Zhang, Q., Sharda, V.: Characterization of storage workload traces from production Windows servers. In: IEEE International Symposium on Workload Characterization, IISWC 2008, pp. 119–128. IEEE (2008)

  15. Kim, Y., Gunasekaran, R., Shipman, G.M., Dillow, D.A., Zhang, Z., Settlemyer, B.W.: Workload characterization of a leadership class storage cluster. In: 5th Petascale Data Storage Workshop (PDSW 2010), pp. 1–5. IEEE (2010)

  16. Kim, Y., Gupta, A., Urgaonkar, B., Berman, P., Sivasubramaniam, A.: Hybridstore: a cost-efficient, high-performance storage system combining SSDs and HDDs. In: IEEE 19th International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS 2011), pp. 227–236. IEEE (2011)

  17. Koller, R., Marmol, L., Rangaswami, R., Sundararaman, S., Talagala, N., Zhao, M.: Write policies for host-side flash caches. In: FAST, pp. 45–58 (2013)

  18. Koller, R., Rangaswami, R.: I/O deduplication: utilizing content similarity to improve I/O performance. ACM Trans. Storage (TOS) 6(3), 13 (2010)

  19. Lee, E., Bahn, H., Noh, S.H.: Unioning of the buffer cache and journaling layers with non-volatile memory. In: FAST, pp. 73–80 (2013)

  20. Leung, A.W., Pasupathy, S., Goodson, G.R., Miller, E.L.: Measurement and analysis of large-scale network file system workloads. In: USENIX Annual Technical Conference, vol. 1, no. 2, pp. 2–5 (June 2008)

  21. Leung, A.W.: Organizing, indexing, and searching large-scale file systems. Dissertations & Theses - Gradworks (2009)

  22. Leung, A.W., Shao, M., Bisson, T., Pasupathy, S., Miller, E.L.: Spyglass: fast, scalable metadata search for large-scale storage systems. In: FAST, vol. 9, pp. 153–166 (2009)

  23. Liu, N., Cope, J., Carns, P., Carothers, C., Ross, R., Grider, G., Crume, A., Maltzahn, C.: On the role of burst buffers in leadership-class storage systems. In: IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST 2012), pp. 1–11. IEEE (2012)

  24. Narayanan, D., Thereska, E., Donnelly, A., Elnikety, S., Rowstron, A.: Migrating server storage to SSDs: analysis of tradeoffs. In: Proceedings of the 4th ACM European Conference on Computer Systems, pp. 145–158. ACM (2009)

  25. Oh, Y., Choi, J., Lee, D., Noh, S.H.: Caching less for better performance: balancing cache size and update cost of flash memory cache in hybrid storage systems. In: FAST, vol. 12 (2012)

  26. Peters, M.: Compellent harnessing SSDS potential. ESG Storage Systems Brief (2009)

  27. Qiang, Z., Chu, L.: CERNET IO workloads: analysis and characterization. J. Comput. Inf. Syst. 8(14), 6017–6024 (2012)

  28. Qiu, S., Reddy, A.N.: Exploiting superpages in a nonvolatile memory file system. In: IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST 2012), pp. 1–5. IEEE (2012)

  29. Ghemawat, S., Gobioff, H., Leung, S.T.: The Google file system. ACM SIGOPS Operating Syst. Rev. 37(5), 29–43 (2003)

  30. Sehgal, P., Voruganti, K., Sundaram, R.: SLO-aware hybrid store. In: IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST 2012), pp. 1–6. IEEE (2012)

  31. Lasser, C., Lordi, R., Stanfill, C.: U.S. Patent No. 5,897,638. U.S. Patent and Trademark Office, Washington, DC (1999)

  32. Vanichpun, S., Makowski, A.M.: The output of a cache under the independent reference model: where did the locality of reference go? In: ACM SIGMETRICS Performance Evaluation Review, vol. 32, pp. 295–306. ACM (2004)

  33. Wallace, G., Douglis, F., Qian, H., Shilane, P., Smaldone, S., Chamness, M., Hsu, W.: Characteristics of backup workloads in production systems. In: FAST, p. 4 (2012)

  34. Wang, F., Xin, Q., Hong, B., Brandt, S.A., Miller, E.L., Long, D.D., McLarty, T.T.: File system workload analysis for large scale scientific computing applications. In: Proceedings of the 21st IEEE/12th NASA Goddard Conference on Mass Storage Systems and Technologies, pp. 139–152 (2004)

  35. Weil, S.A., Brandt, S.A., Miller, E.L., Long, D.D.E., Maltzahn, C.: Ceph: a scalable, high-performance distributed file system. In: Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI), pp. 307–320 (2006)

  36. Weil, S.A., Pollack, K.T., Brandt, S.A., Miller, E.L.: Dynamic metadata management for petabyte-scale file systems. In: Proceedings of the 2004 ACM/IEEE Conference on Supercomputing, p. 4. IEEE Computer Society (2004)

  37. Welch, B., Noer, G.: Optimizing a hybrid SSD/HDD HPC storage system based on file size distributions. In: IEEE 29th Symposium on Mass Storage Systems and Technologies (MSST 2013), pp. 1–12 (2013)

  38. Dufrasne, B., Bauer, W., Careaga, B., Myyrrylainen, J., Rainero, A., Usong, P.: IBM System Storage DS8700 Architecture and Implementation. IBM Redbooks (2011)

  39. Xing, J., Xiong, J., Sun, N., Ma, J.: Adaptive and scalable metadata management to support a trillion files. In: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, p. 26. ACM (2009)

  40. Xu, Q., Arumugam, R.V., Yong, K.L., Mahadevan, S.: Drop: facilitating distributed metadata management in EB-scale storage systems. In: IEEE 29th Symposium on Mass Storage Systems and Technologies (MSST 2013), pp. 1–10. IEEE (2013)

Acknowledgments

This version has benefited greatly from the many detailed comments and suggestions of the anonymous reviewers, which the authors gratefully acknowledge. The work described in this paper is supported by the National Natural Science Foundation of China under Grant Nos. 61370059 and 61232009, the Beijing Natural Science Foundation under Grant No. 4152030, the fund of the State Key Laboratory of Software Development Environment under Grant No. SKLSDE-2014ZX-05, the Open Research Fund of The Academy of Satellite Application under Grant No. 2014_CXJJ-DSJ_04, the Fundamental Research Funds for the Central Universities under Grant Nos. YWF-14-JSJXY-14 and YWF-15-GJSYS-085, and the Open Project Program of the National Engineering Research Center for Science & Technology Resources Sharing Service (Beihang University).

Author information

Corresponding author

Correspondence to Zhisheng Huo.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Huo, Z. et al. (2015). A Metadata Cooperative Caching Architecture Based on SSD and DRAM for File Systems. In: Wang, G., Zomaya, A., Martinez, G., Li, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2015. Lecture Notes in Computer Science, vol 9529. Springer, Cham. https://doi.org/10.1007/978-3-319-27122-4_3

  • DOI: https://doi.org/10.1007/978-3-319-27122-4_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27121-7

  • Online ISBN: 978-3-319-27122-4

  • eBook Packages: Computer Science, Computer Science (R0)
