A High-Performance Hybrid Index Framework Supporting Inserts for Static Learned Indexes

  • Conference paper
Web and Big Data (APWeb-WAIM 2023)

Abstract

The learned index is a new index structure that uses a trained model to directly predict the position of a key and thus achieves high query performance. However, static learned indexes cannot handle insert operations. Although the static PGM-index uses a dynamic data structure to support inserts, it suffers from serious read amplification under read-write workloads, because the inefficient lookup process in the buffers diminishes the benefit of the learned indexes. In addition, because the buffers and the learned indexes are strongly coupled, this structure forces periodic retraining of the internal PGM-indexes, which is unacceptable for static learned indexes that need tuning. This structure is therefore not an ideal general framework. In this paper, we propose a two-layer Hybrid Index Framework (HIF) to address these issues. Specifically, the dynamic layer serves as a buffer for inserts, and the static layer, which consists of static learned indexes, is used for lookups only. HIF effectively alleviates read amplification by searching the static layer directly. With this hierarchical structure, HIF also isolates the learned indexes from insert operations, so it can completely avoid retraining them through its strategy for transforming data from the dynamic layer to the static layer. Moreover, we provide a self-tuning algorithm for learned indexes that cannot be built in a single pass over the data, allowing them to be applied to dynamic workloads with low training overhead. We have conducted experiments using multiple datasets and workloads, and the results show that, on average, three HIF-based static learned indexes, HLI, PGM, and RMI, achieve up to 1.8×, 1.7×, and 1.5× higher throughput than the original dynamic PGM-index when the insert ratio is below 70%.
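
The two-layer idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class names `StaticRun` and `HybridIndexSketch`, the `buffer_limit` threshold, the single linear model per run, and the rule of freezing a full buffer into a new immutable run are all assumptions made for illustration, since the abstract does not specify HIF's actual transformation strategy or model type. The sketch only shows how inserts can be confined to a dynamic buffer while previously built static models are never retrained.

```python
import bisect


class StaticRun:
    """Immutable sorted run with one linear model.

    Hypothetical stand-in for a static learned index (the paper uses
    HLI, PGM, and RMI; none of them is reproduced here).
    """

    def __init__(self, sorted_keys):
        self.keys = list(sorted_keys)
        n = len(self.keys)
        lo, hi = self.keys[0], self.keys[-1]
        # Linear interpolation between the smallest and largest key.
        self.base = lo
        self.slope = (n - 1) / (hi - lo) if hi != lo else 0.0
        # Worst-case prediction error, measured once at build time.
        self.max_err = max(
            abs(self._predict(k) - i) for i, k in enumerate(self.keys)
        )

    def _predict(self, key):
        return int(round((key - self.base) * self.slope))

    def contains(self, key):
        if key < self.keys[0] or key > self.keys[-1]:
            return False
        pos = self._predict(key)
        # Search only the window guaranteed by the recorded error bound.
        lo = max(0, pos - self.max_err)
        hi = min(len(self.keys), pos + self.max_err + 1)
        i = bisect.bisect_left(self.keys, key, lo, hi)
        return i < hi and self.keys[i] == key


class HybridIndexSketch:
    """Dynamic buffer for inserts plus a static layer of frozen runs."""

    def __init__(self, buffer_limit=4):
        self.buffer_limit = buffer_limit  # assumed flush threshold
        self.buffer = []                  # dynamic layer: small sorted list
        self.runs = []                    # static layer: lookup-only runs

    def insert(self, key):
        if self.contains(key):
            return  # set semantics, to keep the sketch simple
        bisect.insort(self.buffer, key)
        if len(self.buffer) >= self.buffer_limit:
            # Assumed transformation: freeze the buffer into a new run.
            # Existing runs and their models are left untouched.
            self.runs.append(StaticRun(self.buffer))
            self.buffer = []

    def contains(self, key):
        # Probe the static layer first, then fall back to the buffer.
        if any(run.contains(key) for run in self.runs):
            return True
        i = bisect.bisect_left(self.buffer, key)
        return i < len(self.buffer) and self.buffer[i] == key
```

In this sketch, inserting keys one at a time into `HybridIndexSketch(buffer_limit=4)` builds a new `StaticRun` every fourth distinct key, and earlier runs keep the same model they were built with. A real framework would also need to bound the number of runs, for example by merging them, which is omitted here.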

Acknowledgements

This paper is supported by the Humanities and Social Sciences Foundation of the Ministry of Education (17YJCZH260), and the Sichuan Science and Technology Program (2020YFS0057).

Author information

Corresponding author

Correspondence to Xujian Zhao.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Ding, Y., Zhao, X. (2024). A High-Performance Hybrid Index Framework Supporting Inserts for Static Learned Indexes. In: Song, X., Feng, R., Chen, Y., Li, J., Min, G. (eds) Web and Big Data. APWeb-WAIM 2023. Lecture Notes in Computer Science, vol 14333. Springer, Singapore. https://doi.org/10.1007/978-981-97-2387-4_30

  • DOI: https://doi.org/10.1007/978-981-97-2387-4_30

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-2386-7

  • Online ISBN: 978-981-97-2387-4

  • eBook Packages: Computer Science (R0)
