LayerBF: A Space Allocation Policy for Bloom Filter in LSM-Tree

  • Conference paper
Web and Big Data (APWeb-WAIM 2023)

Abstract

LSM-Tree based key-value stores commonly suffer from read amplification, because retrieving a particular key typically requires examining multiple layers of SSTables. Bloom filters are widely used to improve query performance, but they are susceptible to false positives, which lead to additional I/Os. Increasing the bloom filter size lowers the false positive rate, but at the cost of higher memory consumption. In response, we have developed LayerBF, a space allocation strategy for layered bloom filters. By leveraging access frequency, LayerBF dynamically allocates the bits-per-key of the bloom filters in each layer: hotter layers are allocated more space, while colder layers are allocated less. This approach reduces the average false positive rate, improves storage read performance, and simultaneously minimizes memory consumption. We have implemented LayerBF in the widely used RocksDB key-value store and evaluated its performance with and without LayerBF on both hard disk drives (HDDs) and solid-state drives (SSDs). The evaluation results demonstrate that LayerBF improves read performance by 5% to 14% and reduces the false positive rate by 8% to 10%.
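
The trade-off described in the abstract can be illustrated with a small sketch. It is not the LayerBF algorithm itself, whose details are not given on this page; it is a simple greedy heuristic, assuming the standard Bloom filter approximation (false positive rate of about exp(-ln(2)^2 * bits_per_key) with an optimal number of hash functions). The function names, layer sizes, and access frequencies are hypothetical and chosen only for illustration.

```python
import math

LN2_SQ = math.log(2) ** 2


def false_positive_rate(bits_per_key: float) -> float:
    """Standard Bloom filter FPR approximation, assuming an optimal hash count."""
    return math.exp(-LN2_SQ * bits_per_key)


def expected_false_positives(bits, access_freq):
    """Access-weighted false positive rate across all layers."""
    return sum(f * false_positive_rate(b) for b, f in zip(bits, access_freq))


def allocate_bits(keys_per_layer, access_freq, total_bits):
    """Greedy heuristic (an assumption, not the paper's policy).

    Repeatedly grants one more bit-per-key to the layer whose weighted FPR
    drops the most per bit of memory spent, until no layer fits the budget.
    Assumes every layer holds at least one key.
    """
    bits = [0] * len(keys_per_layer)
    remaining = total_bits
    while True:
        best, best_gain = None, 0.0
        for i, (n, f) in enumerate(zip(keys_per_layer, access_freq)):
            if n > remaining:
                continue  # one more bit-per-key for this layer costs n bits
            drop = false_positive_rate(bits[i]) - false_positive_rate(bits[i] + 1)
            gain = f * drop / n
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:
            return bits
        bits[best] += 1
        remaining -= keys_per_layer[best]


# Hypothetical three-layer tree: upper layers are small and hot.
keys_per_layer = [1_000, 10_000, 100_000]
access_freq = [0.5, 0.3, 0.2]
budget = 10 * sum(keys_per_layer)  # same memory as a uniform 10 bits-per-key

layered = allocate_bits(keys_per_layer, access_freq, budget)
uniform = [10] * len(keys_per_layer)
print("layered bits-per-key:", layered)
print("uniform  weighted FPR:", expected_false_positives(uniform, access_freq))
print("layered  weighted FPR:", expected_false_positives(layered, access_freq))
```

With the same total memory, the sketch shifts bits toward the small, frequently accessed layers, which lowers the access-weighted false positive rate relative to a uniform setting. The actual gains reported in the abstract come from the authors' RocksDB implementation, not from this model.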

Author information

Corresponding author

Correspondence to Yinliang Yue.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Li, J., Fan, Z., Yue, Y., Yao, Z., Liu, J., Zhou, J. (2024). LayerBF: A Space Allocation Policy for Bloom Filter in LSM-Tree. In: Song, X., Feng, R., Chen, Y., Li, J., Min, G. (eds) Web and Big Data. APWeb-WAIM 2023. Lecture Notes in Computer Science, vol 14333. Springer, Singapore. https://doi.org/10.1007/978-981-97-2387-4_33

  • DOI: https://doi.org/10.1007/978-981-97-2387-4_33

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-2386-7

  • Online ISBN: 978-981-97-2387-4

  • eBook Packages: Computer Science, Computer Science (R0)
