Revealing Secrets in SPARQL Session Level

  • Conference paper
The Semantic Web – ISWC 2020 (ISWC 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12506)

Abstract

Based on Semantic Web technologies, knowledge graphs help users discover information of interest through live SPARQL services. Answer-seekers often examine intermediate results iteratively and modify their SPARQL queries repeatedly within a search session. In this context, understanding user behavior is critical for effective intention prediction and query optimization. However, these behaviors have not yet been studied systematically at the SPARQL session level. This paper reveals secrets of session-level user search behavior through a comprehensive investigation of massive real-world SPARQL query logs. In particular, we thoroughly assess the query changes made by users with respect to structural and data-driven features of SPARQL queries. To illustrate the potential of our findings, we present an application example showing how they can be used; the findings may prove valuable for devising efficient SPARQL caching, auto-completion, query suggestion, approximation, and relaxation techniques in the future. Code and data are available at https://github.com/seu-kse/SparqlSession.

Notes

  1. https://sparqles.ai.wu.ac.at/availability, accessed on 2020/10/24 17:39:51.

  2. A query can be executed multiple times on the same dataset.

  3. Before session recognition, we remove queries with parse errors and collapse contiguous identical queries. For example, \(q_1, q_2, q_2, q_3\) is processed into \(q_1, q_2, q_3\).

  4. We use randomly selected sessions in DBpedia because GED computation is NP-hard and time-consuming on such large-scale data.

  5. We append a 1 to vectors to avoid all-zero vectors.

  6. The features in this vector can also be extended with more dimensions or features.

  7. The query representation method can be replaced by other distributed representations such as trained embeddings. We do not use embeddings here because training them for 10 datasets is highly resource- and time-consuming.

  8. We eliminate the processing error state here.

  9. https://www.w3.org/TR/sparql11-query/#scopeFilters.
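
Notes 3 and 5 describe two small preprocessing steps. The following Python sketch illustrates both under stated assumptions and is not the authors' implementation. The function names (`clean_session`, `append_one`, `cosine`) and the `parses` predicate are hypothetical, parse-error removal is assumed to run before duplicate collapsing, and cosine similarity is used only as an illustrative reason why all-zero vectors cause trouble; the notes do not name the similarity measure.

```python
import math
from itertools import groupby


def clean_session(queries, parses=lambda q: True):
    """Drop queries that fail to parse, then collapse contiguous repeats (Note 3).

    `parses` is a hypothetical predicate standing in for a real SPARQL parser.
    """
    kept = [q for q in queries if parses(q)]
    # groupby groups only adjacent equal items, so non-adjacent repeats survive.
    return [q for q, _ in groupby(kept)]


def append_one(vec):
    """Return a copy of `vec` with a constant 1 appended (Note 5)."""
    return list(vec) + [1.0]


def cosine(u, v):
    """Cosine similarity; raises ZeroDivisionError if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


# The example from Note 3: q1, q2, q2, q3 becomes q1, q2, q3.
session = ["q1", "q2", "q2", "q3"]
cleaned = clean_session(session)

# Two all-zero feature vectors become comparable once a 1 is appended.
zero_sim = cosine(append_one([0, 0]), append_one([0, 0]))
```

Without the appended 1, `cosine([0, 0], [0, 0])` would divide by zero, which is one plausible motivation for the padding described in Note 5.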

Acknowledgements

This work has been supported by the National Natural Science Foundation of China with Grant Nos. 61906037 and U1736204; by the German Federal Ministry for Economic Affairs and Energy (BMWi) within the project RAKI under grant no. 01MD19012D; by the German Federal Ministry of Education and Research (BMBF) within the project DAIKIRI under grant no. 01IS19085B; by the EU H2020 Marie Skłodowska-Curie project KnowGraphs under grant agreement no. 860801; by the National Key Research and Development Program of China with Grant Nos. 2018YFC0830201 and 2017YFB1002801; and by the Fundamental Research Funds for the Central Universities.

Author information

Corresponding author

Correspondence to Meng Wang.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, X., Wang, M., Saleem, M., Ngomo, AC.N., Qi, G., Wang, H. (2020). Revealing Secrets in SPARQL Session Level. In: Pan, J.Z., et al. (eds.) The Semantic Web – ISWC 2020. ISWC 2020. Lecture Notes in Computer Science, vol 12506. Springer, Cham. https://doi.org/10.1007/978-3-030-62419-4_38

  • DOI: https://doi.org/10.1007/978-3-030-62419-4_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-62418-7

  • Online ISBN: 978-3-030-62419-4

  • eBook Packages: Computer Science, Computer Science (R0)
