Abstract
The Log-Structured Merge-Tree (LSM-tree) offers efficient write performance and performs well in big data scenarios. An LSM-tree transforms random writes into batched sequential writes through a multilayer storage structure. However, compaction, its core operation, inevitably causes periodic degradation of read performance. Compaction operations recur, but at irregular intervals, which makes it difficult for the cache to track the access information of data blocks. This work studies how to address the resulting cache invalidation problem. We propose a two-phase parallel prefetching approach that effectively mitigates cache invalidation when compaction occurs. Our experimental results show that the method effectively improves read performance.
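The cache invalidation problem described above can be illustrated with a toy model. The sketch below is not the two-phase parallel prefetching method proposed in this paper, whose details are not given in the abstract. It assumes a block cache keyed by (file id, block number), so that rewriting files during compaction orphans every cached block. The names `BlockCache`, `ToyStore`, `BLOCK_SIZE`, and the `prefetch` flag are hypothetical, and the warm-up step is a naive single-threaded stand-in for a real prefetcher.

```python
from collections import OrderedDict

BLOCK_SIZE = 4  # entries per block (toy value, an assumption)


class BlockCache:
    """Tiny LRU cache keyed by (file_id, block_no)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()
        self.hits = 0
        self.misses = 0

    def put(self, key, block):
        self.blocks[key] = block
        self.blocks.move_to_end(key)
        while len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used

    def get(self, key, load):
        if key in self.blocks:
            self.blocks.move_to_end(key)
            self.hits += 1
            return self.blocks[key]
        self.misses += 1
        block = load()
        self.put(key, block)
        return block

    def drop_file(self, file_id):
        for key in [k for k in self.blocks if k[0] == file_id]:
            del self.blocks[key]


class ToyStore:
    """Immutable sorted files plus a compaction step (no memtable, no levels)."""

    def __init__(self, cache):
        self.cache = cache
        self.files = {}  # file_id -> sorted list of (key, value)
        self.next_id = 0

    def flush(self, items):
        # One batched, sequential write of a sorted run.
        file_id = self.next_id
        self.next_id += 1
        self.files[file_id] = sorted(items.items())
        return file_id

    def read(self, key):
        for file_id in sorted(self.files, reverse=True):  # newest file first
            entries = self.files[file_id]
            for start in range(0, len(entries), BLOCK_SIZE):
                chunk = entries[start:start + BLOCK_SIZE]
                if chunk[0][0] <= key <= chunk[-1][0]:
                    block = self.cache.get((file_id, start // BLOCK_SIZE),
                                           lambda: dict(chunk))
                    if key in block:
                        return block[key]
        return None

    def compact(self, file_ids, prefetch=False):
        # Keys held in cached blocks of the input files (block granularity).
        hot = {k for (fid, _), blk in self.cache.blocks.items()
               if fid in file_ids for k in blk}
        merged = {}
        for file_id in sorted(file_ids):  # newer files overwrite older ones
            merged.update(self.files.pop(file_id))
            self.cache.drop_file(file_id)  # cached blocks are now invalid
        new_id = self.flush(merged)
        if prefetch:  # naive warm-up, NOT the paper's algorithm
            entries = self.files[new_id]
            for start in range(0, len(entries), BLOCK_SIZE):
                block = dict(entries[start:start + BLOCK_SIZE])
                if not hot.isdisjoint(block):
                    self.cache.put((new_id, start // BLOCK_SIZE), block)
        return new_id


def run(prefetch):
    cache = BlockCache(capacity=8)
    store = ToyStore(cache)
    a = store.flush({k: f"a{k}" for k in range(0, 16)})
    b = store.flush({k: f"b{k}" for k in range(8, 24)})
    hot_keys = [1, 2, 9, 10]
    for k in hot_keys:  # warm the cache before compaction
        store.read(k)
    store.compact([a, b], prefetch=prefetch)
    cache.hits = cache.misses = 0
    for k in hot_keys:  # repeat the same reads after compaction
        store.read(k)
    return cache.hits, cache.misses
```

Calling `run(False)` and `run(True)` compares hit and miss counts for the same reads after compaction, without and with the warm-up step. In this toy setting the difference comes only from the fact that the merged file has a new identity, so blocks cached under the old file ids can no longer be found.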
Acknowledgments
This work was supported in part by the National Science Foundation of China Projects (61971309) and the Tianjin Science Foundation Project (18JCYBJC85500).
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, S., Xu, G., Jia, Y., Xue, Y., Zheng, W. (2022). Parallel Cache Prefetching for LSM-Tree Based Store: From Algorithm to Evaluation. In: Lai, Y., Wang, T., Jiang, M., Xu, G., Liang, W., Castiglione, A. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2021. Lecture Notes in Computer Science, vol 13155. Springer, Cham. https://doi.org/10.1007/978-3-030-95384-3_15
DOI: https://doi.org/10.1007/978-3-030-95384-3_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-95383-6
Online ISBN: 978-3-030-95384-3
eBook Packages: Computer Science (R0)