Revealing Secrets in SPARQL Session Level

Zhang, Xinyue; Wang, Meng; Saleem, Muhammad; Ngomo, Axel-Cyrille Ngonga; Qi, Guilin; Wang, Haofen

doi:10.1007/978-3-030-62419-4_38

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12506)

Included in the following conference series:

International Semantic Web Conference

Abstract

Based on Semantic Web technologies, knowledge graphs help users discover information of interest through live SPARQL services. Answer-seekers often examine intermediate results iteratively and modify their SPARQL queries repeatedly within a search session. In this context, understanding user behavior is critical for effective intention prediction and query optimization. However, this behavior has not yet been studied systematically at the SPARQL session level. This paper reveals secrets of session-level user search behavior through a comprehensive investigation of massive real-world SPARQL query logs. In particular, we thoroughly assess the query changes made by users with respect to structural and data-driven features of SPARQL queries. To illustrate the potential of our findings, we present an application example showing how they can be used; the findings may be valuable for devising efficient SPARQL caching, auto-completion, query suggestion, approximation, and relaxation techniques in the future. Code and data are available at https://github.com/seu-kse/SparqlSession.

Notes

1. https://sparqles.ai.wu.ac.at/availability, accessed on 2020/10/24 17:39:51.
2. A query can be executed multiple times on the same dataset.
3. We remove queries with parse errors and contiguous identical queries before the recognition. For example, $q_1, q_2, q_2, q_3$ is processed into $q_1, q_2, q_3$.
4. We use randomly selected sessions in DBpedia because computing the graph edit distance (GED) is NP-hard and time-consuming on such large-scale data.
5. We append a 1 to vectors to avoid all-zero vectors.
6. This vector can also be extended with more dimensions or features.
7. The query representation method can be replaced by other distributed representations such as trained embeddings. We do not use embeddings here because training them for 10 datasets is highly resource- and time-consuming.
8. We eliminate the processing error state here.
9. https://www.w3.org/TR/sparql11-query/#scopeFilters.
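
The preprocessing described in note 3 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `clean_session` and the `parses` callback are hypothetical, the callback merely stands in for a real SPARQL parser check, and filtering before collapsing is an assumed order that the note does not specify.

```python
from itertools import groupby
from typing import Callable, Iterable, List


def clean_session(
    queries: Iterable[str],
    parses: Callable[[str], bool] = lambda query: True,
) -> List[str]:
    """Drop unparsable queries, then collapse runs of identical neighbours.

    `parses` is a hypothetical stand-in for a real SPARQL parser check;
    the default accepts every query.
    """
    valid = (query for query in queries if parses(query))
    # groupby groups only adjacent equal items, so a query that recurs
    # later in the session (non-contiguously) is kept.
    return [query for query, _ in groupby(valid)]


print(clean_session(["q1", "q2", "q2", "q3"]))  # ['q1', 'q2', 'q3']
```

Because only adjacent repeats are merged, a session such as $q_1, q_2, q_1$ is left unchanged, which preserves cases where a user returns to an earlier query.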

Acknowledgements

This work has been supported by the National Natural Science Foundation of China with Grant Nos. 61906037 and U1736204; by the German Federal Ministry for Economic Affairs and Energy (BMWi) within the project RAKI under grant no. 01MD19012D; by the German Federal Ministry of Education and Research (BMBF) within the project DAIKIRI under grant no. 01IS19085B; by the EU H2020 Marie Skłodowska-Curie project KnowGraphs under grant agreement no. 860801; by the National Key Research and Development Program of China with Grant Nos. 2018YFC0830201 and 2017YFB1002801; and by the Fundamental Research Funds for the Central Universities.

Author information

Authors and Affiliations

Southeast University, Nanjing, China
Xinyue Zhang, Meng Wang & Guilin Qi
Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing, China
Meng Wang & Guilin Qi
AKSW, Leipzig University, Leipzig, Germany
Muhammad Saleem
University of Paderborn, Paderborn, Germany
Axel-Cyrille Ngonga Ngomo
Intelligent Big Data Visualization Lab, Tongji University, Shanghai, China
Haofen Wang

Corresponding author

Correspondence to Meng Wang.

Editor information

Editors and Affiliations

University of Edinburgh, Edinburgh, UK
Jeff Z. Pan
University of Liverpool, Liverpool, UK
Valentina Tamma
University of Bari, Bari, Italy
Claudia d’Amato
University of California, Santa Barbara, Santa Barbara, CA, USA
Krzysztof Janowicz
California State University, Long Beach, Long Beach, CA, USA
Bo Fu
Vienna University of Economics and Business, Vienna, Austria
Axel Polleres
Rensselaer Polytechnic Institute, Troy, NY, USA
Oshani Seneviratne
Massachusetts Institute of Technology, Cambridge, MA, USA
Lalana Kagal

Cite this paper

Zhang, X., Wang, M., Saleem, M., Ngomo, AC.N., Qi, G., Wang, H. (2020). Revealing Secrets in SPARQL Session Level. In: Pan, J.Z., et al. The Semantic Web – ISWC 2020. ISWC 2020. Lecture Notes in Computer Science, vol 12506. Springer, Cham. https://doi.org/10.1007/978-3-030-62419-4_38

DOI: https://doi.org/10.1007/978-3-030-62419-4_38
Published: 01 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62418-7
Online ISBN: 978-3-030-62419-4
eBook Packages: Computer Science, Computer Science (R0)
