ABSTRACT
Unlike traditional block-based SSDs, Zoned Namespace (ZNS) SSDs expose storage through the zoned block interface, which eliminates the need for in-device garbage collection (GC) and shifts that responsibility to applications on the host. Applications can therefore use application-aware data placement to perform GC efficiently. RocksDB on ZNS SSDs runs on ZenFS, a user-level file system whose Lifetime-based Zone Allocation algorithm (LIZA) places data with similar invalidation times (lifetimes) in the same zone, minimizing the GC overhead of copying valid data when a zone is reclaimed. However, LIZA predicts the lifetime of each SSTable from its level in the hierarchical structure of the LSM-tree, and because these predictions are inaccurate, it is ineffective at reducing write amplification (WA). Based on our observation that the deletion time of SSTables in the LSM-tree is determined solely by the compaction process, we propose a novel Compaction-Aware Zone Allocation algorithm (CAZA) that allows newly created SSTables to be deleted together after they are merged in a future compaction. CAZA is implemented in RocksDB's ZenFS, and our extensive evaluations show that CAZA significantly reduces the WA overhead compared to LIZA.
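The idea of compaction-aware placement can be illustrated with a minimal sketch. It is not the CAZA implementation in ZenFS: the `SSTable` and `Zone` classes, the `pick_zone` function, the integer key ranges, and the adjacent-level overlap heuristic are all assumptions made for illustration. The sketch only shows the general principle that an SSTable is placed in a zone already holding SSTables it is likely to be compacted with, so that the zone's contents become invalid at about the same time.

```python
from dataclasses import dataclass, field


@dataclass
class SSTable:
    # Hypothetical, simplified SSTable descriptor (not a RocksDB type).
    name: str
    level: int
    min_key: int
    max_key: int
    size: int  # arbitrary size units

    def overlaps(self, other: "SSTable") -> bool:
        # Two closed key ranges overlap if each starts before the other ends.
        return self.min_key <= other.max_key and other.min_key <= self.max_key


@dataclass
class Zone:
    # Hypothetical, simplified zone descriptor (not a ZenFS type).
    zone_id: int
    capacity: int
    tables: list = field(default_factory=list)

    @property
    def free(self) -> int:
        return self.capacity - sum(t.size for t in self.tables)


def pick_zone(new: SSTable, zones: list) -> Zone:
    """Assumed heuristic: prefer the zone holding the most data whose key
    range overlaps `new` in the same or an adjacent level, since such
    SSTables are candidates to be merged (and deleted) together."""
    best, best_overlap = None, 0
    for zone in zones:
        if zone.free < new.size:
            continue  # the zone cannot hold the new SSTable
        overlap = sum(
            t.size
            for t in zone.tables
            if abs(t.level - new.level) <= 1 and t.overlaps(new)
        )
        if overlap > best_overlap:
            best, best_overlap = zone, overlap
    if best is not None:
        return best
    # Fallback (also an assumption): use the zone with the most free space.
    candidates = [z for z in zones if z.free >= new.size]
    if not candidates:
        raise RuntimeError("no zone has enough free space")
    return max(candidates, key=lambda z: z.free)


zones = [
    Zone(0, 100, [SSTable("a", 2, 0, 50, 40)]),
    Zone(1, 100, [SSTable("b", 2, 60, 90, 40)]),
    Zone(2, 100),
]
new_table = SSTable("c", 1, 55, 80, 30)
chosen = pick_zone(new_table, zones)
print(chosen.zone_id)
```

In this example the new level-1 SSTable covers keys 55 to 80, which overlaps only SSTable `b` in zone 1, so zone 1 is chosen. A purely level-based policy such as the one the abstract attributes to LIZA would not distinguish between zones 0 and 1, because both hold level-2 data.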
Index Terms
- Compaction-aware zone allocation for LSM based key-value store on ZNS SSDs