Abstract
Based on Semantic Web technologies, knowledge graphs help users discover information of interest through live SPARQL services. Answer-seekers often examine intermediate results iteratively and modify SPARQL queries repeatedly within a search session. In this context, understanding user behaviors is critical for effective intention prediction and query optimization. However, these behaviors have not yet been studied systematically at the SPARQL session level. This paper reveals secrets of session-level user search behaviors through a comprehensive investigation of massive real-world SPARQL query logs. In particular, we thoroughly assess the query changes made by users w.r.t. structural and data-driven features of SPARQL queries. To illustrate the potential of our findings, we present an application example showing how they can be used, which might be valuable for devising efficient SPARQL caching, auto-completion, query suggestion, approximation, and relaxation techniques in the future (code and data are available at https://github.com/seu-kse/SparqlSession).
Notes
1. https://sparqles.ai.wu.ac.at/availability, accessed on 2020/10/24 17:39:51.
2. A query can be executed multiple times on the same dataset.
3. We remove queries with parse errors and contiguous identical queries before the recognition. For example, \(q_1, q_2, q_2, q_3\) is processed into \(q_1, q_2, q_3\).
4. We use randomly selected sessions in DBpedia because GED computation is NP-hard and therefore too time-consuming on such large-scale data.
5. We append a 1 to vectors to avoid all-zero vectors.
6. Features in this vector can also be extended to more dimensions or features.
7. The query representation method can be replaced by other distributed representations such as trained embeddings. We do not use embeddings here because training them for 10 datasets is highly resource- and time-consuming.
8. We eliminate the processing error state here.
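The preprocessing described in note 3 can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `preprocess_queries` and the `is_parsable` predicate are hypothetical, and the order of the two steps (drop unparsable queries first, then collapse contiguous repeats) is an assumption, since the note does not specify it.

```python
from itertools import groupby
from typing import Callable, Iterable, List


def preprocess_queries(
    queries: Iterable[str],
    is_parsable: Callable[[str], bool] = lambda q: True,
) -> List[str]:
    """Drop unparsable queries, then collapse runs of identical queries.

    `is_parsable` is a hypothetical hook; in practice it would wrap a
    SPARQL parser. By default every query is treated as valid.
    """
    # Assumed step 1: filter out queries that fail to parse.
    valid = (q for q in queries if is_parsable(q))
    # Step 2: groupby groups only *adjacent* equal items, so each run of
    # contiguous duplicates is reduced to a single query.
    return [query for query, _run in groupby(valid)]


if __name__ == "__main__":
    session = ["q1", "q2", "q2", "q3"]
    print(preprocess_queries(session))
```

Only adjacent repeats are collapsed, so a query that reappears later in the same sequence (for example `q1, q2, q1`) is kept. Under the assumed step order, two identical queries separated only by an unparsable one are also merged.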
Acknowledgements
This work has been supported by the National Natural Science Foundation of China with Grant Nos. 61906037 and U1736204; by the German Federal Ministry for Economic Affairs and Energy (BMWi) within the project RAKI under grant no. 01MD19012D; by the German Federal Ministry of Education and Research (BMBF) within the project DAIKIRI under grant no. 01IS19085B; by the EU H2020 Marie Skłodowska-Curie project KnowGraphs under grant agreement no. 860801; by the National Key Research and Development Program of China with Grant Nos. 2018YFC0830201 and 2017YFB1002801; and by the Fundamental Research Funds for the Central Universities.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, X., Wang, M., Saleem, M., Ngomo, A.-C.N., Qi, G., Wang, H. (2020). Revealing Secrets in SPARQL Session Level. In: Pan, J.Z., et al. The Semantic Web – ISWC 2020. ISWC 2020. Lecture Notes in Computer Science, vol 12506. Springer, Cham. https://doi.org/10.1007/978-3-030-62419-4_38
DOI: https://doi.org/10.1007/978-3-030-62419-4_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62418-7
Online ISBN: 978-3-030-62419-4
eBook Packages: Computer Science, Computer Science (R0)