Abstract
The learned index is a new index structure that uses a trained model to predict the position of a key directly, and thus achieves high query performance. However, static learned indexes cannot handle insert operations. Although the dynamic PGM-index augments static PGM-indexes with a dynamic data structure to support inserts, it suffers from severe read amplification under read-write workloads, because the inefficient lookup process over the buffers erodes the advantage of the learned indexes. Moreover, this design requires periodic retraining of the internal PGM-indexes, since the buffers and the learned indexes are tightly coupled, which is unacceptable for static learned indexes that require tuning. Such a structure is therefore not an ideal general framework. In this paper, we propose a two-layer Hybrid Index Framework (HIF) to address these issues. Specifically, the dynamic layer serves as a buffer for inserts, while the static layer, consisting of static learned indexes, handles lookups only. HIF effectively alleviates read amplification by searching the static layer directly, and its hierarchical structure isolates the learned indexes from insert operations. HIF can thus avoid retraining the learned indexes entirely through a transformation strategy that moves data from the dynamic layer to the static layer. Moreover, we provide a self-tuning algorithm for learned indexes that cannot be built in a single pass over the data, allowing them to be applied to dynamic workloads with low training overhead. Experiments on multiple datasets and workloads show that, on average, three HIF-based static learned indexes (HLI, PGM, and RMI) achieve up to 1.8×, 1.7×, and 1.5× higher throughput than the original dynamic PGM-index when the insert ratio is below 70%.
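The two-layer design described above can be illustrated with a minimal sketch. This is not the authors' implementation: the class name, the buffer threshold, and the use of a single error-bounded linear model as a stand-in for a real static learned index (such as a PGM-index or RMI) are all assumptions made for illustration. It shows the key ideas only: lookups go to the static layer first, inserts go to a small dynamic buffer, and the buffer is periodically transformed into a fresh static layer rather than retraining the model on every insert.

```python
import bisect

class HybridIndexSketch:
    """Illustrative two-layer index: a dynamic buffer for inserts and a
    static, model-backed sorted array for lookups. The 'learned index'
    is simplified here to one linear model with a recorded error bound."""

    def __init__(self, keys, buffer_limit=4):
        self.buffer = []                # dynamic layer: small sorted list
        self.buffer_limit = buffer_limit
        self._build_static(sorted(keys))

    def _build_static(self, keys):
        # Static layer: fit key -> position with a linear model and record
        # the maximum prediction error, so a lookup only binary-searches a
        # bounded window (stand-in for a real static learned index).
        self.static = keys
        n = len(keys)
        if n >= 2 and keys[-1] != keys[0]:
            self.slope = (n - 1) / (keys[-1] - keys[0])
        else:
            self.slope = 0.0
        self.base = keys[0] if keys else 0
        self.err = 0
        for i, k in enumerate(keys):
            pred = round((k - self.base) * self.slope)
            self.err = max(self.err, abs(pred - i))

    def insert(self, key):
        bisect.insort(self.buffer, key)
        if len(self.buffer) > self.buffer_limit:
            # Transformation step: merge the buffer into the static layer
            # and rebuild it in one pass. (HIF's actual strategy avoids
            # retraining; a full rebuild is used here only for brevity.)
            merged = sorted(self.static + self.buffer)
            self.buffer = []
            self._build_static(merged)

    def lookup(self, key):
        # Search the static layer directly; the buffer is consulted only
        # on a miss, which limits read amplification.
        if self.static:
            pred = round((key - self.base) * self.slope)
            lo = max(0, pred - self.err)
            hi = max(lo, min(len(self.static), pred + self.err + 1))
            i = bisect.bisect_left(self.static, key, lo, hi)
            if i < len(self.static) and self.static[i] == key:
                return True
        i = bisect.bisect_left(self.buffer, key)
        return i < len(self.buffer) and self.buffer[i] == key
```

Under this sketch, a read-heavy workload touches only the error-bounded static search, while inserts cost a cheap sorted-list insertion until the buffer overflows and is absorbed into the static layer.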
References
Kraska, T., Beutel, A., Chi, E.H., Dean, J., Polyzotis, N.: The case for learned index structures. In: SIGMOD, pp. 489–504 (2018)
Galakatos, A., Markovitch, M., Binnig, C., Fonseca, R., Kraska, T.: Fiting-tree: A data-aware index structure. In: SIGMOD, pp. 1189–1206 (2019)
Ferragina, P., Vinciguerra, G.: The PGM-index: a fully-dynamic compressed learned index with provable worst-case bounds. Proc. VLDB Endow. 13, 1162–1175 (2020)
Ding, Y., Zhao, X., Jin, P.: An error-bounded space-efficient hybrid learned index with high lookup performance. In: DEXA, pp. 216–228. Springer (2022)
Bingmann, T.: STX B+ Tree (2013). https://panthema.net/2007/stx-btree
Marcus, R., et al.: Benchmarking learned indexes. Proc. VLDB Endow. 14, 1–13 (2020)
Wongkham, C., Lu, B., Liu, C., Zhong, Z., Lo, E., Wang, T.: Are updatable learned indexes ready? Proc. VLDB Endow. 15, 3004–3017 (2022)
Xie, Q., Pang, C., Zhou, X., Zhang, X., Deng, K.: Maximum error-bounded piecewise linear representation for online stream approximation. VLDB J. 23, 915–937 (2014)
Li, X., Li, J., Wang, X.: Aslm: Adaptive single layer model for learned index. In: DASFAA Workshops, pp. 80–95 (2019)
Li, P., Hua, Y., Jia, J., Zuo, P.: Finedex: a fine-grained learned index scheme for scalable and concurrent memory systems. Proc. VLDB Endow. 15, 321–334 (2021)
Ding, J., et al.: ALEX: an updatable adaptive learned index. In: SIGMOD, pp. 969–984 (2020)
Wu, J., Zhang, Y., Chen, S., Wang, J., Chen, Y., Xing, C.: Updatable learned index with precise positions. Proc. VLDB Endow. 14, 1276–1288 (2021)
Tang, C., et al.: Xindex: a scalable learned index for multicore data storage. In: PPoPP, pp. 308–320 (2020)
Lu, B., Ding, J., Lo, E., Minhas, U.F., Wang, T.: Apex: a high-performance learned index on persistent memory. Proc. VLDB Endow. 15, 597–610 (2021)
Zhang, Z., et al.: Plin: a persistent learned index for non-volatile memory with high performance and instant recovery. Proc. VLDB Endow. 16, 243–255 (2022)
Zhang, J., Gao, Y.: Carmi: a cache-aware learned index with a cost-based construction algorithm. Proc. VLDB Endow. 15, 2679–2691 (2021)
Kipf, A., Marcus, R., van Renen, A., Stoian, M., Kemper, A., Kraska, T., Neumann, T.: RadixSpline: a single-pass learned index. In: aiDM@SIGMOD, pp. 1–5 (2020)
Acknowledgements
This paper is supported by the Humanities and Social Sciences Foundation of the Ministry of Education (17YJCZH260), and the Sichuan Science and Technology Program (2020YFS0057).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Ding, Y., Zhao, X. (2024). A High-Performance Hybrid Index Framework Supporting Inserts for Static Learned Indexes. In: Song, X., Feng, R., Chen, Y., Li, J., Min, G. (eds) Web and Big Data. APWeb-WAIM 2023. Lecture Notes in Computer Science, vol 14333. Springer, Singapore. https://doi.org/10.1007/978-981-97-2387-4_30
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-2386-7
Online ISBN: 978-981-97-2387-4
eBook Packages: Computer Science (R0)