Function Interpolation for Learned Index Structures

Setiawan, Naufal Fikri; Rubinstein, Benjamin I. P.; Borovica-Gajic, Renata

doi:10.1007/978-3-030-39469-1_6

Conference paper
First Online: 21 January 2020

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12008)

Abstract

Range indexes such as B-trees are widely recognised as effective data structures for fast retrieval of records by query key. While such classical indexes offer optimal worst-case guarantees, recent research suggests that average-case performance might be improved by machine learning-based models such as deep neural networks. This paper explores an alternative approach that models the task as function approximation via interpolation between compressed subsets of keys. We explore the Chebyshev and Bernstein polynomial bases and demonstrate substantial benefits over deep neural networks. In particular, our proposed function interpolation models exhibit a memory footprint two orders of magnitude smaller than that of neural network models and a 30–40% accuracy improvement over neural networks trained for the same amount of time, while keeping query time generally on par with neural network models.
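
To make the idea of interpolation-based indexing concrete, the following is a minimal sketch and not the authors' implementation. It assumes sorted, distinct numeric keys (at least two) held in a NumPy array, fits a Chebyshev interpolant to the key-to-position function, and corrects each prediction with a bounded local search. The names `ChebIndex`, `build_index`, and `lookup`, the default degree, and the error-window strategy are all illustrative assumptions.

```python
from typing import NamedTuple, Optional

import numpy as np
from numpy.polynomial import chebyshev as C


class ChebIndex(NamedTuple):
    """Hypothetical container for the fitted model (not from the paper)."""
    lo: float            # smallest key
    hi: float            # largest key
    coeffs: np.ndarray   # Chebyshev coefficients, degree + 1 floats
    max_err: int         # worst-case position error over the stored keys


def _to_unit(x, lo, hi):
    """Map keys from [lo, hi] onto [-1, 1], the Chebyshev domain."""
    return 2.0 * (x - lo) / (hi - lo) - 1.0


def build_index(keys: np.ndarray, degree: int = 16) -> ChebIndex:
    """Interpolate the key -> position function at Chebyshev nodes."""
    lo, hi = float(keys[0]), float(keys[-1])
    # Chebyshev points of the first kind on [-1, 1], mapped into the key range.
    nodes = C.chebpts1(degree + 1)
    sample_keys = lo + (nodes + 1.0) * (hi - lo) / 2.0
    # Target: number of stored keys strictly below each sample point.
    targets = np.searchsorted(keys, sample_keys, side="left").astype(float)
    # degree + 1 points and degree `degree`: the fit interpolates the samples.
    coeffs = C.chebfit(nodes, targets, degree)
    # Measure the worst error on the stored keys; +1 is slack for rounding.
    guesses = np.rint(C.chebval(_to_unit(keys, lo, hi), coeffs)).astype(int)
    max_err = int(np.max(np.abs(guesses - np.arange(len(keys))))) + 1
    return ChebIndex(lo, hi, coeffs, max_err)


def lookup(index: ChebIndex, keys: np.ndarray, key: float) -> Optional[int]:
    """Return the position of `key` in `keys`, or None if it is absent."""
    if key < index.lo or key > index.hi:
        return None
    n = len(keys)
    t = _to_unit(key, index.lo, index.hi)
    guess = int(np.rint(C.chebval(t, index.coeffs)))
    # Search only the window that the measured error bound allows.
    start = min(max(0, guess - index.max_err), n)
    end = max(start, min(n, guess + index.max_err + 1))
    i = start + int(np.searchsorted(keys[start:end], key, side="left"))
    if i < n and keys[i] == key:
        return i
    return None
```

In this sketch the model itself consists of only `degree + 1` coefficients plus three scalars, which illustrates why a polynomial interpolant can be far smaller than a neural network. Calling `build_index(keys)` once and then `lookup(index, keys, k)` per query returns the position of `k`, with the error window guaranteeing that stored keys are always found.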

Notes

1. It is sufficient but not necessary to prohibit insertions/deletions as done in [16].

References

1. Aldà, F., Rubinstein, B.I.P.: The Bernstein mechanism: function release under differential privacy. In: AAAI, pp. 1705–1711 (2017)
2. Bayer, R., McCreight, E.: Organization and maintenance of large ordered indices. In: SIGFIDET, pp. 107–141 (1970)
3. Boyd, J.P., Ong, J.R.: Exponentially-convergent strategies for defeating the Runge phenomenon for the approximation of non-periodic functions, part I: single-interval schemes. Commun. Comput. Phys. 5(2–4), 484–497 (2009)
4. Brisebarre, N., Joldeş, M.: Chebyshev interpolation polynomial-based tools for rigorous computing. In: Proceedings of the 2010 International Symposium on Symbolic and Algebraic Computation, pp. 147–154. ACM (2010)
5. Cheney, E.W.: Introduction to Approximation Theory. McGraw-Hill, New York (1966)
6. Collobert, R., Kavukcuoglu, K., Farabet, C.: Torch7: a Matlab-like environment for machine learning. In: BigLearn NIPS Workshop (2011)
7. Galakatos, A., Markovitch, M., Binnig, C., Fonseca, R., Kraska, T.: FITing-Tree: a data-aware index structure. In: SIGMOD, pp. 1189–1206 (2019)
8. Gammerman, A., Vovk, V., Vapnik, V.: Learning by transduction. In: UAI, pp. 148–155 (1998)
9. Gil, A., Segura, J., Temme, N.M.: Numerical Methods for Special Functions. Society for Industrial and Applied Mathematics (2007)
10. Goldstein, J., Ramakrishnan, R., Shaft, U.: Compressing relations and indexes. In: ICDE (1998)
11. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016)
12. Graefe, G., Larson, P.A.: B-tree indexes and CPU caches. In: ICDE, pp. 349–358 (2001)
13. Hadian, A., Heinis, T.: Interpolation-friendly B-trees: bridging the gap between algorithmic and learned indexes. In: EDBT, pp. 710–713 (2019)
14. Idreos, S., Kersten, M.L., Manegold, S.: Database cracking. In: CIDR, pp. 68–78 (2007)
15. Kim, C., et al.: FAST: fast architecture sensitive tree search on modern CPUs and GPUs. In: SIGMOD, pp. 339–350 (2010)
16. Kraska, T., Beutel, A., Chi, E.H., Dean, J., Polyzotis, N.: The case for learned index structures. In: SIGMOD (2018)
17. Kubica, J.M., Moore, A., Connolly, A.J., Jedicke, R.: Spatial data structures for efficient trajectory-based queries. Technical report, CMU-RI-TR-04-61, Carnegie Mellon University (2004)
18. Leis, V., Kemper, A., Neumann, T.: The adaptive radix tree: artful indexing for main-memory databases. In: ICDE, pp. 38–49 (2013)
19. Microsoft: Hardware and software requirements for installing SQL Server
20. Mitzenmacher, M.: A model for learned Bloom filters, and optimizing by sandwiching. In: NIPS, pp. 462–471 (2018)
21. Rao, J., Ross, K.A.: Making B+-trees cache conscious in main memory. In: SIGMOD, pp. 475–486 (2000)
22. Schonfelder, J.: Chebyshev expansions for the error and related functions. Math. Comput. 32(144), 1232–1240 (1978)
23. Shalev-Shwartz, S., Ben-David, S.: Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, Cambridge (2014)
24. Stonebraker, M.: The case for partial indexes. SIGMOD Rec. 18(4), 4–11 (1989)

Author information

Authors and Affiliations

School of Computing and Information Systems, University of Melbourne, Melbourne, Australia
Naufal Fikri Setiawan, Benjamin I. P. Rubinstein & Renata Borovica-Gajic

Corresponding author

Correspondence to Naufal Fikri Setiawan.

Editor information

Editors and Affiliations

University of Melbourne, Parkville, Australia
Renata Borovica-Gajic
School of Computing and Information Systems, University of Melbourne, Parkville, VIC, Australia
Jianzhong Qi
Monash University, Clayton, Australia
Weiqing Wang

About this paper

Cite this paper

Setiawan, N.F., Rubinstein, B.I.P., Borovica-Gajic, R. (2020). Function Interpolation for Learned Index Structures. In: Borovica-Gajic, R., Qi, J., Wang, W. (eds) Databases Theory and Applications. ADC 2020. Lecture Notes in Computer Science, vol 12008. Springer, Cham. https://doi.org/10.1007/978-3-030-39469-1_6

DOI: https://doi.org/10.1007/978-3-030-39469-1_6
Published: 21 January 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-39468-4
Online ISBN: 978-3-030-39469-1
eBook Packages: Computer Science, Computer Science (R0)
