Abstract
NoSQL databases have gained much attention recently. Because they use the Log-Structured Merge (LSM) tree, they support high write throughput and fast lookups on primary keys. Nevertheless, implementing a secondary index for these databases remains a challenging task. Modern storage technologies such as flash memory and phase-change memory make the situation even more difficult. The most important problems of these memory types are limited write endurance and the asymmetry between write and read latency. These limitations affect both the index structure and the index modification methods.
In this paper, we propose a new bulk-loading method for the secondary index in LSM-based stores on flash memory. Bulk loading takes place when many insert operations are performed in one batch. The method works on a new LSM tree variant optimized for flash memory, called the Flash Aware LSM (FA-LSM) tree. To reach optimal performance, the method can adapt to a changing workload. We conduct several experiments, which confirm that our method outperforms the traditional LSM insert strategy by about 30% while preserving high search efficiency.
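To make the idea of bulk loading concrete, the following is a minimal sketch of a generic LSM-style secondary index, not the FA-LSM tree or the method proposed in the paper. The class `ToyLSMIndex`, its `memtable_capacity` parameter, and the `flushes` counter (a stand-in for writes to flash) are hypothetical names chosen for illustration. The sketch only contrasts inserting entries one at a time, which flushes a sorted run each time the in-memory buffer fills, with sorting a whole batch once and writing it as a single run.

```python
import bisect


class ToyLSMIndex:
    """Toy secondary index holding (secondary_key, primary_key) entries.

    Illustrative assumption only: a generic LSM layout with one in-memory
    buffer and a list of immutable sorted runs. It is not FA-LSM.
    """

    def __init__(self, memtable_capacity=4):
        self.memtable_capacity = memtable_capacity
        self.memtable = []  # unsorted in-memory buffer
        self.runs = []      # immutable sorted runs, newest last
        self.flushes = 0    # simulated number of run writes to flash

    def _flush(self):
        # Write the buffer out as one sorted run, if it holds anything.
        if self.memtable:
            self.runs.append(sorted(self.memtable))
            self.memtable = []
            self.flushes += 1

    def insert(self, sec_key, prim_key):
        # Traditional path: buffer the entry, flush when the buffer is full.
        self.memtable.append((sec_key, prim_key))
        if len(self.memtable) >= self.memtable_capacity:
            self._flush()

    def bulk_load(self, entries):
        # Batch path: flush pending entries, then sort the whole batch once
        # and write it as a single sorted run.
        self._flush()
        batch = sorted(entries)
        if batch:
            self.runs.append(batch)
            self.flushes += 1

    def lookup(self, sec_key):
        # Collect primary keys for sec_key from every run and the buffer.
        found = set()
        for run in self.runs:
            i = bisect.bisect_left(run, (sec_key,))
            while i < len(run) and run[i][0] == sec_key:
                found.add(run[i][1])
                i += 1
        for s, p in self.memtable:
            if s == sec_key:
                found.add(p)
        return sorted(found)


batch = [("red", 1), ("blue", 2), ("red", 3), ("green", 4),
         ("blue", 5), ("red", 6), ("green", 7), ("blue", 8)]

one_by_one = ToyLSMIndex(memtable_capacity=4)
for sec, prim in batch:
    one_by_one.insert(sec, prim)

bulk = ToyLSMIndex(memtable_capacity=4)
bulk.bulk_load(batch)

print(one_by_one.flushes, bulk.flushes)
print(one_by_one.lookup("red") == bulk.lookup("red"))
```

In this toy setting both indexes answer lookups identically, while the batch path writes fewer runs. How many writes a real system saves depends on its merge policy and on the flash-specific design, which this sketch does not model.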
Acknowledgment
The paper is supported by Wroclaw University of Science and Technology (subvention number: IDUB/8211204601).
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Macyna, W., Kukowski, M. (2022). Bulk Loading of the Secondary Index in LSM-Based Stores for Flash Memory. In: Chiusano, S., et al. New Trends in Database and Information Systems. ADBIS 2022. Communications in Computer and Information Science, vol 1652. Springer, Cham. https://doi.org/10.1007/978-3-031-15743-1_13
DOI: https://doi.org/10.1007/978-3-031-15743-1_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-15742-4
Online ISBN: 978-3-031-15743-1
eBook Packages: Computer Science (R0)