TrieKV: Managing Values After KV Separation to Optimize Scan Performance in LSM-Tree

  • Conference paper
Web and Big Data (APWeb-WAIM 2023)

Abstract

Persistent key-value (KV) stores are mainly built on the Log-Structured Merge-tree (LSM-tree) for high write performance, yet the LSM-tree suffers from inherently high I/O amplification, which degrades read and write performance as KV stores grow in size. KV separation mitigates I/O amplification by storing only keys in the LSM-tree and keeping values in separate storage. However, KV separation breaks the key order of the stored values, which hurts range query performance. We propose TrieKV, which exploits the sequential read performance advantage of hard-disk drives (HDDs) to improve range query performance. TrieKV uses a dynamic prefix index and a collaborative KV data merging and sorting mechanism to manage values after KV separation. Compared with WiscKey, a typical KV separation storage system, TrieKV achieves \(2.35\times\) the range query performance on HDD. TrieKV also outperforms WiscKey on all six YCSB workloads.
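
To make the range query problem concrete, here is a minimal Python sketch of generic KV separation. It is not TrieKV's or WiscKey's implementation: the class `SeparatedStore`, its methods, and the in-memory `bytearray` standing in for an on-disk value log are all hypothetical names chosen for illustration. A sorted key list plays the role of the LSM-tree, and values are appended to the log in arrival order.

```python
import bisect


class SeparatedStore:
    """Toy KV-separated store (illustrative assumption, not a real system)."""

    def __init__(self):
        self.keys = []            # sorted keys; stands in for the LSM-tree
        self.pointers = {}        # key -> (offset, length) in the value log
        self.vlog = bytearray()   # append-only value log, in write order

    def put(self, key, value):
        # Values are appended in arrival order, not key order.
        offset = len(self.vlog)
        self.vlog += value
        if key not in self.pointers:
            bisect.insort(self.keys, key)
        self.pointers[key] = (offset, len(value))

    def get(self, key):
        offset, length = self.pointers[key]
        return bytes(self.vlog[offset:offset + length])

    def scan(self, start, end):
        """Return (key, offset, value) for start <= key < end, in key order."""
        lo = bisect.bisect_left(self.keys, start)
        hi = bisect.bisect_left(self.keys, end)
        result = []
        for key in self.keys[lo:hi]:
            offset, length = self.pointers[key]
            result.append((key, offset, bytes(self.vlog[offset:offset + length])))
        return result


store = SeparatedStore()
for key in ["k3", "k1", "k4", "k2"]:       # keys arrive out of order
    store.put(key, ("v-" + key).encode())

scan_result = store.scan("k1", "k4")
offsets = [offset for _, offset, _ in scan_result]
print(offsets)  # value-log offsets visited by the scan, in key order
```

The scan returns keys in sorted order, but the offsets it visits in the value log jump back and forth. On an HDD each jump is a seek, which is the range query penalty the abstract attributes to KV separation; keeping values sorted by key would turn those jumps into sequential reads.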

Z. Yao and Y. Song—These authors contributed equally to this work.

Author information

Corresponding author

Correspondence to Yinliang Yue.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Yao, Z., Song, Y., Yue, Y., Liu, J., Fan, Z. (2024). TrieKV: Managing Values After KV Separation to Optimize Scan Performance in LSM-Tree. In: Song, X., Feng, R., Chen, Y., Li, J., Min, G. (eds) Web and Big Data. APWeb-WAIM 2023. Lecture Notes in Computer Science, vol 14333. Springer, Singapore. https://doi.org/10.1007/978-981-97-2387-4_27

  • DOI: https://doi.org/10.1007/978-981-97-2387-4_27

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-2386-7

  • Online ISBN: 978-981-97-2387-4

  • eBook Packages: Computer Science, Computer Science (R0)