
Design and evaluation of a user-level file system for fast storage devices

Cluster Computing

Abstract

The use of fast storage devices has recently been growing rapidly in social network services, cloud platforms, and similar environments. Unfortunately, the traditional Linux I/O stack is designed to maximize performance on disk-based storage. Emerging byte-addressable, low-latency non-volatile memory technologies (e.g., phase-change memories, MRAMs, and the memristor) have very different characteristics, so the disk-based I/O stack cannot deliver high performance on them. This paper presents a high-performance I/O stack for fast storage devices. Our scheme removes the concept of the block and simplifies the whole I/O path and software stack, leaving only two layers: the byte-capable interface and a byte-aware file system called BAFS. We aim to minimize I/O latency and maximize bandwidth by eliminating unnecessary layers and supporting byte-addressable I/O without requiring changes to applications. We have implemented a prototype and evaluated its performance with multiple benchmarks. The experimental results show that our I/O stack achieves performance gains of 6.2 times on average and up to 17.5 times over the existing Linux I/O stack.
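To make the motivation for removing the block concept concrete, the following minimal sketch compares how many bytes a request moves when it is widened to whole blocks versus when it is served at byte granularity. It is only an illustration under stated assumptions, not the BAFS implementation: the 4096-byte block size and the function names `block_transfer_bytes` and `byte_transfer_bytes` are hypothetical and do not come from the paper.

```python
BLOCK_SIZE = 4096  # assumed block size for illustration; not taken from the paper


def block_transfer_bytes(offset: int, length: int, block_size: int = BLOCK_SIZE) -> int:
    """Bytes moved when a request is widened to cover whole blocks."""
    if length <= 0:
        return 0
    first_block = offset // block_size
    last_block = (offset + length - 1) // block_size
    return (last_block - first_block + 1) * block_size


def byte_transfer_bytes(offset: int, length: int) -> int:
    """Bytes moved when the device is addressed at byte granularity."""
    return max(length, 0)


if __name__ == "__main__":
    # Hypothetical requests: a small read, a small read straddling a block
    # boundary, and a larger unaligned read.
    for offset, length in [(0, 64), (4090, 16), (100, 8192)]:
        blk = block_transfer_bytes(offset, length)
        byt = byte_transfer_bytes(offset, length)
        print(f"offset={offset} length={length}: block-based={blk} B, byte-based={byt} B")
```

Small or unaligned requests show the largest gap, which is the kind of overhead a byte-addressable I/O path is meant to avoid.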


References

  1. Axboe, J.: Fio benchmark, April (1998)

  2. Card, R., Tso, T., Tweedie, S.: Design and implementation of the second extended filesystem. In: Proceedings of the First Dutch International Symposium on Linux, pp. 1–6. Monterey (1994)

  3. Caulfield, A.M., De, A., Coburn, J., Mollov, T.I., Gupta, R.K., Swanson, S.: Moneta: a high-performance storage array architecture for next-generation, non-volatile memories. In: Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture, pp. 385–395. IEEE Computer Society, Washington, DC (2010)

  4. Caulfield, A.M., Mollov, T.I., Eisner, L.A., De, A., Coburn, J., Swanson, S.: Providing safe, user space access to fast, solid state disks. SIGARCH Comput. Archit. News 40(1), 387–400 (2012)

  5. Condit, J., Nightingale, E.B., Frost, C., Ipek, E., Lee, B., Burger, D., Coetzee, D.: Better I/O through byte-addressable, persistent memory. In: Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, SOSP ’09, pp. 133–146. ACM, New York (2009)

  6. Katti, R.R., Stadler, H.L., Wu, J.-C.: Non-volatile magnetic random access memory. US Patent 5,289,410, 22 Feb 1994

  7. Kim, H., Seshadri, S., Dickey, C.L., Chiu, L.: Evaluating phase change memory for enterprise storage systems: a study of caching and tiering approaches. In: Proceedings of the 12th USENIX Conference on File and Storage Technologies (FAST 14) USENIX, pp. 33–45. Santa Clara, CA (2014)

  8. Lu, L., Arpaci-Dusseau, A.C., Arpaci-Dusseau, R.H., Lu, S.: A study of linux file system evolution. Trans. Storage 10(1), 3:1–3:32 (2014)

  9. Mathur, A., Cao, M., Bhattacharya, S., Dilger, A., Tomas, A., Vivier, L.: The new ext4 filesystem: current status and future plans. In: Ottawa Linux Symposium. http://ols.108.redhat.com/2007/Reprints/mathur-Reprint.pdf (2007)

  10. Norcott, W.D.: Iozone file system benchmark (2011)

  11. Oi, H.: A case study: performance evaluation of a dram-based solid state disk. In: Japan–China Joint Workshop on Frontier of Computer Science and Technology, FCST 2007, pp. 57–60 (2007)

  12. Raoux, S., Burr, G., Breitwisch, M., Rettner, C., Chen, Y., Shelby, R., Salinga, M., Krebs, D., Chen, S.H., Lung, H.L., Lam, C.: Phase-change random access memory: a scalable technology. IBM J. Res. Dev. 52(4.5), 465–479 (2008)

  13. Rodeh, O.: B-trees, shadowing, and clones. Trans. Storage 3(4), 2:1–2:27 (2008)

  14. Rodeh, O., Bacik, J., Mason, C.: The linux b-tree filesystem. Trans. Storage 9(3), 9:1–9:32 (2013)

  15. Seppanen, E., O’Keefe, M., Lilja, D.: High performance solid state storage under linux. In: IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), pp. 1–12 (2010)

  16. Shin, D.I., Yu, Y.J., Kim, H.S., Choi, J.W., Jung, D.Y., Yeom, H.Y.: Dynamic interval polling and pipelined post i/o processing for low-latency storage class memory. In: Proceedings of the 5th USENIX Conference on Hot Topics in Storage and File Systems, USENIX Association, pp. 5–5 (2013)

  17. Son, Y., Choi, J.W., Eom, H., Yeom, H.Y.: Optimizing the file system with variable-length I/O for fast storage devices. In: Proceedings of the 4th Asia-Pacific Workshop on Systems, APSys ’13, pp. 14:1–14:6. ACM, New York (2013)

  18. Son, Y., Song, N.Y., Eom, H., Yeom, H.Y.: A user-level file system for fast storage devices. In: Workshop on Autonomic Management of High Performance Grid and Cloud Computing (AMGCC), London (2014)

  19. Sweeney, A., Doucette, D., Hu, W., Anderson, C., Nishimoto, M., Peck, G.: Scalability in the xfs file system. In: USENIX Annual Technical Conference, vol. 15 (1996)

  20. TAILWINDSTORAGE. Extreme s3804 (2014)

  21. Worthington, B.L., Ganger, G.R., Patt, Y.N.: Scheduling algorithms for modern disk drives. In: Proceedings of the 1994 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, SIGMETRICS ’94, pp. 241–251. ACM, New York (1994)

  22. Wu, X., Reddy, A.L.N.: Scmfs: a file system for storage class memory. In: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, SC ’11, pp. 39:1–39:11. ACM, New York (2011)

  23. Yang, J., Minturn, D.B., Hady, F.: When poll is better than interrupt. In: Proceedings of the 10th USENIX Conference on File and Storage Technologies, FAST’12, pp. 3–3. USENIX Association, Berkeley (2012)

  24. Yu, Y.J., Shin, D.I., Shin, W., Song, N.Y., Choi, J.W., Kim, H.S., Eom, H., Yeom, H.Y.: Optimizing the block I/O subsystem for fast storage devices. ACM Trans. Comput. Syst. 32(2), 6:1–6:48 (2014)

  25. Yu, Y.J., Shin, D.I., Shin, W., Song, N.Y., Eom, H., Yeom, H.Y.: Exploiting peak device throughput from random access workload. In: Proceedings of the 4th USENIX Conference on Hot Topics in Storage and File Systems, USENIX Association, pp. 7–7 (2012)

Acknowledgments

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. 0421-20150075) and partly supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2014R1A1A2055032).

Author information

Corresponding author

Correspondence to Hyeonsang Eom.

Additional information

A preliminary version [18] of this paper was presented at AMGCC 2014, London, England.

About this article

Cite this article

Son, Y., Song, N.Y., Han, H. et al. Design and evaluation of a user-level file system for fast storage devices. Cluster Comput 18, 1075–1086 (2015). https://doi.org/10.1007/s10586-015-0465-5

  • DOI: https://doi.org/10.1007/s10586-015-0465-5
