Abstract
We propose a general framework for predicting graph query performance with respect to three performance metrics: execution time, query answer quality, and memory consumption. The learning framework generates and makes use of informative statistics from data and query structure and employs a multi-label regression model to predict the multi-metric query performance. We apply the framework to study two common graph query classes—reachability and graph pattern matching; the two classes differ significantly in their query complexity. For both query classes, we develop suitable performance models and learning algorithms to predict the performance. We demonstrate the efficacy of our framework via experiments on real-world information and social networks. Furthermore, by leveraging the framework, we propose a novel workload optimization algorithm and show that it improves the efficiency of workload management by 54% on average.
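To make the idea of predicting several metrics at once concrete, the following is a minimal sketch of a multi-output regressor. It is not the model developed in this chapter: a simple k-nearest-neighbor average stands in for the learned regression model. The feature names, the training rows, and the `predict` helper are all hypothetical and chosen only for illustration.

```python
from math import sqrt

# Hypothetical training set (illustrative values, not measured results).
# Features: (query nodes, query edges, average label frequency).
# Targets:  (execution time in ms, answer quality in [0, 1], memory in MB).
TRAIN = [
    ((2.0, 1.0, 0.10), (12.0, 0.95, 30.0)),
    ((3.0, 3.0, 0.20), (40.0, 0.90, 55.0)),
    ((4.0, 5.0, 0.25), (95.0, 0.80, 90.0)),
    ((6.0, 9.0, 0.40), (310.0, 0.70, 210.0)),
]


def distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict(features, train=TRAIN, k=2):
    """Predict all metrics jointly by averaging the k nearest training rows.

    Features are used unscaled here; a real pipeline would normalize them
    so that no single feature dominates the distance.
    """
    nearest = sorted(train, key=lambda row: distance(row[0], features))[:k]
    n_metrics = len(nearest[0][1])
    return tuple(
        sum(row[1][i] for row in nearest) / len(nearest)
        for i in range(n_metrics)
    )


if __name__ == "__main__":
    # One call yields a prediction for every metric of an unseen query.
    time_ms, quality, memory_mb = predict((3.5, 4.0, 0.22))
    print(time_ms, quality, memory_mb)
```

A single model call returns one value per metric, which is the property a multi-metric predictor needs. Any regressor that supports vector-valued targets could be substituted for the nearest-neighbor average used here.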
Acknowledgments
Sasani and Gebremedhin are supported in part by NSF CAREER award IIS-1553528. Namaki and Wu are supported in part by NSF IIS-1633629 and Huawei Innovation Research Program (HIRP).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Sasani, K., Namaki, M.H., Wu, Y., Gebremedhin, A.H. (2018). Multi-metric Graph Query Performance Prediction. In: Pei, J., Manolopoulos, Y., Sadiq, S., Li, J. (eds) Database Systems for Advanced Applications. DASFAA 2018. Lecture Notes in Computer Science, vol 10827. Springer, Cham. https://doi.org/10.1007/978-3-319-91452-7_19
DOI: https://doi.org/10.1007/978-3-319-91452-7_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91451-0
Online ISBN: 978-3-319-91452-7
eBook Packages: Computer Science (R0)