ANSWER: Automatic Index Selector for Knowledge Graphs

Qi, Zhixin; Zhang, Haoran; Wang, Hongzhi; Chao, Zemin

doi:10.1007/978-981-97-2390-4_27

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14332)

Included in the following conference series:

Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data

Abstract

Efficient access is a basic premise for making full use of knowledge graphs. Since query processing efficiency is mainly affected by the index configuration, it is necessary to create effective indexes for knowledge graphs. However, none of the existing studies of index selection focuses on the characteristics of knowledge graphs. To fill this gap, we propose ANSWER, an automatic index selector for knowledge graphs based on reinforcement learning, which selects an appropriate index configuration according to historical workloads. However, it is challenging to train an effective index selection model because of the large action space of the reinforcement learning model and the need for lightweight embedding strategies. To address this problem, we first develop a novel predicate filter, which not only determines which vertical partitioning tables are worth indexing but also reduces the action space of the model. Based on the filtered predicates, we derive an effective and lightweight encoder that embeds the main features of workloads into the model while keeping ANSWER efficient. Experimental results on real-world knowledge graphs demonstrate the effectiveness of ANSWER in terms of knowledge graph query processing.
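The abstract does not spell out how the predicate filter or the encoder work, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a workload is a list of queries, that each query is a list of (subject, predicate, object) triple patterns, that predicates are ranked by how often they appear in the workload, and that each vertical partitioning table has two indexable columns (subject and object). The function names `filter_predicates`, `encode_workload`, and `candidate_actions`, and the example data, are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

# A triple pattern is (subject, predicate, object); a query is a list of patterns.
TriplePattern = Tuple[str, str, str]
Query = List[TriplePattern]


def filter_predicates(workload: List[Query], k: int) -> List[str]:
    """Keep the k predicates that occur most often in the workload.

    Assumption: frequency is used as the notion of "valuable"; the paper's
    actual criterion is not given in the abstract.
    """
    counts = Counter(p for query in workload for (_, p, _) in query)
    # Sort by descending count, then by name so ties are broken deterministically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [predicate for predicate, _ in ranked[:k]]


def encode_workload(workload: List[Query], predicates: List[str]) -> List[float]:
    """Encode a workload as relative predicate frequencies over the kept predicates."""
    kept = set(predicates)
    counts = Counter(p for query in workload for (_, p, _) in query if p in kept)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(predicates)
    return [counts[p] / total for p in predicates]


def candidate_actions(predicates: List[str]) -> List[Tuple[str, str]]:
    """One candidate index per (vertical partitioning table, column) pair."""
    return [(p, column) for p in predicates for column in ("subject", "object")]


# Hypothetical example workload with four distinct predicates.
example_workload: List[Query] = [
    [("?x", "type", "Person"), ("?x", "worksAt", "?y")],
    [("?x", "worksAt", "?y"), ("?y", "locatedIn", "?z")],
    [("?x", "worksAt", "HIT")],
    [("?a", "knows", "?b")],
]

kept_predicates = filter_predicates(example_workload, k=2)
state_vector = encode_workload(example_workload, kept_predicates)
actions = candidate_actions(kept_predicates)
```

In this sketch, keeping 2 of the 4 predicates cuts the candidate index set from 8 to 4 entries, which illustrates how filtering predicates shrinks the action space a reinforcement learning agent has to explore. The fixed-length frequency vector stands in for a lightweight workload embedding.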

Acknowledgements

This paper was partially supported by NSFC grant U1866602. Haoran Zhang and Zhixin Qi contributed to the work equally and should be regarded as co-first authors.

Author information

Authors and Affiliations

Harbin Institute of Technology, Xidazhi Street 92, Harbin, China
Zhixin Qi, Haoran Zhang, Hongzhi Wang & Zemin Chao

Corresponding author

Correspondence to Hongzhi Wang.

Editor information

Editors and Affiliations

Peng Cheng Laboratory, Shenzhen, China
Xiangyu Song
China University of Geosciences, Wuhan, China
Ruyi Feng
China University of Geosciences, Wuhan, China
Yunliang Chen
Deakin University, Burwood, VIC, Australia
Jianxin Li
University of Exeter, Exeter, UK
Geyong Min

About this paper

Cite this paper

Qi, Z., Zhang, H., Wang, H., Chao, Z. (2024). ANSWER: Automatic Index Selector for Knowledge Graphs. In: Song, X., Feng, R., Chen, Y., Li, J., Min, G. (eds) Web and Big Data. APWeb-WAIM 2023. Lecture Notes in Computer Science, vol 14332. Springer, Singapore. https://doi.org/10.1007/978-981-97-2390-4_27

DOI: https://doi.org/10.1007/978-981-97-2390-4_27
Published: 28 April 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-2389-8
Online ISBN: 978-981-97-2390-4
eBook Packages: Computer Science, Computer Science (R0)
