Abstract
Emerging fast, byte-addressable persistent memory (PM) promises substantial storage performance gains over traditional disks. We present TPFS, a tiered file system that combines PM and slow disks to create a storage system with near-PM performance and large capacity. TPFS steers each incoming file input/output (I/O) request to PM, dynamic random access memory (DRAM), or disk depending on its synchronicity, write size, and read frequency. TPFS profiles the application's access stream online to predict file access behavior. In the background, TPFS estimates the "temperature" of file data and migrates the write-cold and read-hot file data from PM to disks. To fully utilize disk bandwidth, TPFS coalesces data blocks into large, sequential writes. Experimental results show that, with a small amount of PM and a large solid-state drive (SSD), TPFS achieves up to 7.3× and 7.9× higher throughput than EXT4 and XFS, respectively, running on an SSD alone. As the amount of PM grows, TPFS's performance improves until it matches that of a PM-only file system.
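The steering decision described in the abstract can be illustrated with a minimal sketch. The abstract names the inputs (synchronicity, write size, read frequency) but not the rules that map them to a tier, so everything below is an assumption made for illustration: the names `Tier`, `WriteRequest`, `place_write`, and `cache_read_in_dram` are hypothetical, the thresholds are arbitrary, and the rules themselves (synchronous writes to PM, large asynchronous writes to disk, small asynchronous writes buffered in DRAM, frequently read data cached in DRAM) are one plausible policy, not TPFS's actual one.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    PM = "pm"
    DRAM = "dram"
    DISK = "disk"


@dataclass
class WriteRequest:
    size_bytes: int
    synchronous: bool  # True if the caller waits for the data to be durable


# Hypothetical thresholds, chosen only for illustration.
LARGE_WRITE_BYTES = 4 * 1024 * 1024
HOT_READS_PER_WINDOW = 8


def place_write(req: WriteRequest) -> Tier:
    """Pick a tier for an incoming write (assumed rules, not TPFS's own)."""
    if req.synchronous:
        # The caller stalls until the write is durable, so use the fastest
        # persistent tier.
        return Tier.PM
    if req.size_bytes >= LARGE_WRITE_BYTES:
        # Large asynchronous writes are already sequential enough for disk.
        return Tier.DISK
    # Small asynchronous writes can be buffered in DRAM and flushed later.
    return Tier.DRAM


def cache_read_in_dram(reads_in_window: int) -> bool:
    """Decide whether data read this often is worth caching in DRAM."""
    return reads_in_window >= HOT_READS_PER_WINDOW
```

In this sketch, a 4 KiB synchronous write is placed on PM, while an 8 MiB asynchronous write goes straight to disk. A real system would derive the synchronicity and read-frequency inputs from an online profile of the access stream rather than from per-request flags.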
Index Terms
- TPFS: A High-Performance Tiered File System for Persistent Memories and Disks