Bulk Loading of the Secondary Index in LSM-Based Stores for Flash Memory

  • Conference paper
New Trends in Database and Information Systems (ADBIS 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1652)

Abstract

NoSQL databases have gained much attention recently. Because they are built on the Log-Structured Merge (LSM) tree, they support high write throughput and fast lookups on primary keys. Nevertheless, implementing a secondary index for these databases remains a challenging task. Modern storage technologies such as flash memory and phase change memory make the situation even more difficult. The most important problems of such memory types are limited write endurance and the asymmetry between write and read latency. These limitations affect both the index structure and the index modification methods.

In this paper, we propose a new bulk-loading method for the secondary index in LSM-based stores for flash memory. Bulk loading takes place when many insert operations are performed in one batch. The method works on a new LSM tree variant optimized for flash memory, called the Flash Aware LSM (FA-LSM) tree. To reach optimal performance, the method can be adapted to a changing workload. We conduct several experiments, which confirm that our method outperforms the traditional LSM insert strategy by about 30% while preserving high search efficiency.
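
The difference between per-entry insertion and bulk loading can be illustrated with a minimal Python sketch. It is not the FA-LSM tree or the method proposed in the paper, whose details the abstract does not give. The class `ToySecondaryIndex`, its `memtable_limit` parameter, and the `flushes` counter are hypothetical names introduced only for illustration, and the number of runs written is used as a crude stand-in for flash write cost.

```python
from bisect import bisect_left


class ToySecondaryIndex:
    """Toy LSM-style secondary index: secondary key -> primary keys.

    Illustrative only; this is NOT the FA-LSM tree from the paper.
    """

    def __init__(self, memtable_limit=4):
        self.memtable_limit = memtable_limit  # max distinct keys buffered in memory
        self.memtable = {}                    # secondary key -> list of primary keys
        self.runs = []                        # immutable sorted runs, oldest first
        self.flushes = 0                      # runs written out (proxy for flash writes)

    def _write_run(self, grouped):
        # Persist one sorted, immutable run built from a key -> primary-keys mapping.
        if grouped:
            self.runs.append(sorted((k, tuple(v)) for k, v in grouped.items()))
            self.flushes += 1

    def insert(self, secondary_key, primary_key):
        # Traditional path: buffer in the memtable, flush a run when it fills up.
        self.memtable.setdefault(secondary_key, []).append(primary_key)
        if len(self.memtable) >= self.memtable_limit:
            self._write_run(self.memtable)
            self.memtable = {}

    def bulk_load(self, entries):
        # Bulk path: flush anything buffered, then group the whole batch
        # and write it as a single sorted run.
        self._write_run(self.memtable)
        self.memtable = {}
        grouped = {}
        for secondary_key, primary_key in entries:
            grouped.setdefault(secondary_key, []).append(primary_key)
        self._write_run(grouped)

    def lookup(self, secondary_key):
        # Collect matches from the memtable and from every run (binary search per run).
        result = list(self.memtable.get(secondary_key, []))
        for run in self.runs:
            i = bisect_left(run, (secondary_key,))
            if i < len(run) and run[i][0] == secondary_key:
                result.extend(run[i][1])
        return sorted(result)


# 32 entries spread over 8 distinct secondary keys (made-up data).
entries = [(f"city{i % 8}", i) for i in range(32)]

incremental = ToySecondaryIndex()
for sk, pk in entries:
    incremental.insert(sk, pk)

bulk = ToySecondaryIndex()
bulk.bulk_load(entries)

print("runs written, per-entry inserts:", incremental.flushes)
print("runs written, bulk load:", bulk.flushes)
print("same lookup result:", incremental.lookup("city3") == bulk.lookup("city3"))
```

In this toy model the batch ends up in one sorted run instead of many small ones, so a later lookup has fewer runs to probe. A real LSM store would also compact runs across levels, which the sketch leaves out.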

Acknowledgment

The paper is supported by Wroclaw University of Science and Technology (subvention number: IDUB/8211204601).

Author information

Corresponding author

Correspondence to Wojciech Macyna.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Macyna, W., Kukowski, M. (2022). Bulk Loading of the Secondary Index in LSM-Based Stores for Flash Memory. In: Chiusano, S., et al. New Trends in Database and Information Systems. ADBIS 2022. Communications in Computer and Information Science, vol 1652. Springer, Cham. https://doi.org/10.1007/978-3-031-15743-1_13

  • DOI: https://doi.org/10.1007/978-3-031-15743-1_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15742-4

  • Online ISBN: 978-3-031-15743-1

  • eBook Packages: Computer Science, Computer Science (R0)