Abstract
LSM-tree-based key-value stores suffer from read amplification: retrieving a key typically requires examining SSTables in multiple layers. Bloom filters are commonly used to speed up queries, but they produce false positives, each of which leads to additional I/O. Enlarging a Bloom filter lowers its false positive rate at the cost of higher memory consumption. In response, we have developed LayerBF, a space allocation strategy for layered Bloom filters. LayerBF uses access frequency to dynamically set the bits-per-key of the Bloom filters in each layer: hotter layers receive more space and colder layers receive less. This approach reduces the average false positive rate and improves read performance while keeping memory consumption low. We have implemented LayerBF in the widely used RocksDB key-value store and evaluated RocksDB with and without LayerBF on both hard disk drives (HDDs) and solid-state drives (SSDs). The results show that LayerBF improves read performance by 5% to 14% and reduces the false positive rate by 8% to 10%.
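The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the authors' allocation algorithm, which the abstract does not specify. It assumes the standard approximation for a Bloom filter with an optimal number of hash functions, where the false positive rate is exp(-(ln 2)^2 * bits_per_key). The function names, layer sizes, access shares, and bits-per-key values are hypothetical and chosen only for illustration.

```python
import math


def bloom_fpr(bits_per_key: float) -> float:
    """Approximate false positive rate of a Bloom filter that uses the
    optimal number of hash functions for the given bits-per-key."""
    return math.exp(-(math.log(2) ** 2) * bits_per_key)


def weighted_fpr(access_share, bits_per_key):
    """Average false positive rate, weighted by each layer's share of lookups."""
    total = sum(access_share)
    return sum(a / total * bloom_fpr(b) for a, b in zip(access_share, bits_per_key))


def filter_memory_bits(keys_per_layer, bits_per_key):
    """Total filter memory in bits across all layers."""
    return sum(n * b for n, b in zip(keys_per_layer, bits_per_key))


# Hypothetical three-layer tree: small hot layers, one large cold layer.
keys_per_layer = [1_000, 10_000, 100_000]
access_share = [0.5, 0.3, 0.2]      # assumed fraction of filter probes per layer

uniform = [10, 10, 10]              # same bits-per-key everywhere
skewed = [20, 14, 9]                # more bits for hot layers, fewer for the cold one

for name, alloc in (("uniform", uniform), ("skewed", skewed)):
    print(
        name,
        "memory_bits =", filter_memory_bits(keys_per_layer, alloc),
        "weighted_fpr =", round(weighted_fpr(access_share, alloc), 5),
    )
```

With these assumed numbers, the skewed allocation uses fewer total bits than the uniform one and yields a lower access-weighted false positive rate, because the hot layers hold few keys and extra bits there are cheap. Real workloads may distribute accesses across layers very differently.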
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Li, J., Fan, Z., Yue, Y., Yao, Z., Liu, J., Zhou, J. (2024). LayerBF: A Space Allocation Policy for Bloom Filter in LSM-Tree. In: Song, X., Feng, R., Chen, Y., Li, J., Min, G. (eds) Web and Big Data. APWeb-WAIM 2023. Lecture Notes in Computer Science, vol 14333. Springer, Singapore. https://doi.org/10.1007/978-981-97-2387-4_33
DOI: https://doi.org/10.1007/978-981-97-2387-4_33
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-2386-7
Online ISBN: 978-981-97-2387-4
eBook Packages: Computer Science, Computer Science (R0)