Abstract
Efficient access to knowledge graphs is a basic premise for making full use of them. Because query processing efficiency is mainly affected by the index configuration, effective indexes must be created for knowledge graphs. However, none of the existing studies on index selection focuses on the characteristics of knowledge graphs. To fill this gap, we propose ANSWER, an automatic index selector for knowledge graphs based on reinforcement learning, which selects an appropriate index configuration according to historical workloads. Training a good index selection model is challenging, however, because of the large action space of the reinforcement learning model and the need for lightweight embedding strategies. To address this problem, we first develop a novel predicate filter, which not only determines which vertical partitioning tables are worth indexing but also reduces the action space of the model. Based on the filtered predicates, we derive an effective and lightweight encoder that embeds the main features of workloads into the model while keeping ANSWER efficient. Experimental results on real-world knowledge graphs demonstrate the effectiveness of ANSWER in terms of knowledge graph query processing.
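The abstract does not specify how the predicate filter or the encoder work internally. As a rough illustration only, the following Python sketch assumes a frequency-based filter: it counts how often each predicate appears in the triple patterns of a historical workload, keeps the most frequent ones (each corresponding to one vertical partitioning table and hence one candidate index action), and encodes the workload as a normalized frequency vector over the kept predicates. The function names `filter_predicates` and `encode_workload`, the `top_k` parameter, and the toy workload are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def count_predicates(workload):
    """Count predicate occurrences over all triple patterns in the workload."""
    return Counter(p for query in workload for (_, p, _) in query)


def filter_predicates(workload, top_k):
    """Hypothetical filter: keep the top_k most frequent predicates.

    Each kept predicate stands for one vertical partitioning table, so the
    action space shrinks from all predicates to top_k candidate indexes.
    """
    counts = count_predicates(workload)
    # Sort by descending frequency, then by name so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [p for p, _ in ranked[:top_k]]


def encode_workload(workload, kept):
    """Hypothetical lightweight encoder: normalized frequencies of kept predicates."""
    counts = count_predicates(workload)
    total = sum(counts[p] for p in kept)
    if total == 0:
        return [0.0] * len(kept)
    return [counts[p] / total for p in kept]


# Toy workload (assumed): each query is a list of (subject, predicate, object) patterns.
workload = [
    [("?x", "worksAt", "?y"), ("?y", "locatedIn", "?z")],
    [("?x", "worksAt", "?y")],
    [("?x", "bornIn", "?c"), ("?x", "worksAt", "?y")],
    [("?a", "locatedIn", "?b")],
]

kept = filter_predicates(workload, top_k=2)
state = encode_workload(workload, kept)
print(kept, state)
```

In this sketch the rarely used predicate is dropped before any learning happens, so a reinforcement learning agent would only choose among two index actions instead of three; the actual filtering criterion and state encoding used by ANSWER may differ.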
Acknowledgements
This paper was partially supported by NSFC grant U1866602. Haoran Zhang and Zhixin Qi contributed to the work equally and should be regarded as co-first authors.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Qi, Z., Zhang, H., Wang, H., Chao, Z. (2024). ANSWER: Automatic Index Selector for Knowledge Graphs. In: Song, X., Feng, R., Chen, Y., Li, J., Min, G. (eds) Web and Big Data. APWeb-WAIM 2023. Lecture Notes in Computer Science, vol 14332. Springer, Singapore. https://doi.org/10.1007/978-981-97-2390-4_27
DOI: https://doi.org/10.1007/978-981-97-2390-4_27
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-2389-8
Online ISBN: 978-981-97-2390-4
eBook Packages: Computer Science, Computer Science (R0)