Abstract
Modern file systems assume the use of disk, a system-wide performance bottleneck for over a decade. Current disk caching and RAM file systems either impose high overhead to access memory content or fail to provide mechanisms to achieve data persistence across reboots.

The Conquest file system is based on the observation that memory is becoming inexpensive, which enables all file system services to be delivered from memory, except for providing large storage capacity. Unlike caching, Conquest uses memory with battery backup as persistent storage, and provides specialized and separate data paths to memory and disk. Therefore, the memory data path contains no disk-related complexity. The disk data path consists of optimizations only for the specialized disk usage pattern.

Compared to a memory-based file system, Conquest incurs little performance overhead. Compared to several disk-based file systems, Conquest achieves 1.3x to 19x faster memory performance, and 1.4x to 2.0x faster performance when exercising both memory and disk.

Conquest realizes most of the benefits of persistent RAM at a fraction of the cost of a RAM-only solution. It also demonstrates that disk-related optimizations impose high overheads for accessing memory content in a memory-rich environment.
Index Terms
- The Conquest file system: Better performance through a disk/persistent-RAM hybrid design