LogLInc: LoG Queries of Linked Open Data Investigator for Cube Design

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11706)

Abstract

By avoiding the ‘data not invented here’ (NIH) syndrome, a mindset that consists in focusing solely on data created inside the walls of a business (https://urlz.fr/9Yo9), companies have realized the benefit of including external sources in their data cubes. In this context, Linked Open Data (LOD) is a promising external source that may contain both valuable data and query logs that materialize the exploration of that data by end users. Paradoxically, the datasets of this external source are structured whereas the logs are “ugly”; if the logs are turned into rich structured data, they can contribute to building valuable data cubes. In this paper, we claim that the NIH syndrome must also be considered for query logs. Consequently, we propose an approach that investigates the particularities of SPARQL query logs issued against the LOD, and augmented by the LOD, to discover multidimensional patterns for leveraging and enriching a data cube. To show the effectiveness of our approach, different scenarios are proposed and evaluated using DBpedia.

Notes

  1. https://www.ibm.com/downloads/cas/MQBM7GOW

  2. https://www.w3.org/DesignIssues/LinkedData.html

  3. https://urlz.fr/9Yoc

  4. http://usewod.org/

  5. http://aksw.github.io/LSQ/

  6. http://wp.sigmod.org/?p=2277

  7. http://wiki.dbpedia.org/

  8. https://www.w3.org/RDF/

  9. Property path expressions are negligible in our corpus, so our study does not focus on them.

  10. Details about the operations can be found in the W3C recommendation: https://urlz.fr/9Yqk

  11. A SPARQL endpoint is an HTTP-based query service that executes SPARQL queries over the linked dataset, e.g., http://dbpedia.org/sparql

  12. https://jena.apache.org/documentation/ontology/

  13. https://jena.apache.org/documentation/tdb/

  14. https://wiki.dbpedia.org/lookup
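
As an illustration of note 11, the sketch below shows how a SPARQL query can be carried to an HTTP endpoint. It is not taken from the paper: it assumes the standard SPARQL 1.1 Protocol convention of sending the query text in the `query` URL parameter of a GET request, and the function name `build_sparql_get_url` and the example query are hypothetical. Only the endpoint address comes from the note. No network request is made; the code only builds and decodes the URL.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint address quoted in note 11; everything else is illustrative.
DBPEDIA_ENDPOINT = "http://dbpedia.org/sparql"


def build_sparql_get_url(endpoint: str, query: str) -> str:
    """Return the GET URL that carries `query` to a SPARQL endpoint.

    Assumption: the endpoint follows the SPARQL 1.1 Protocol, which
    accepts the query text in the `query` URL parameter.
    """
    encoded = urlencode({"query": query})  # percent-encodes the query text
    return endpoint + "?" + encoded


# Hypothetical example query: ten resources typed as dbo:City.
EXAMPLE_QUERY = (
    "PREFIX dbo: <http://dbpedia.org/ontology/> "
    "SELECT ?city WHERE { ?city a dbo:City } LIMIT 10"
)

url = build_sparql_get_url(DBPEDIA_ENDPOINT, EXAMPLE_QUERY)
print(url)

# Decoding the URL recovers the original query text, which is roughly
# what a server-side query log would record for this request.
logged_query = parse_qs(urlsplit(url).query)["query"][0]
print(logged_query)
```

Fetching the URL with any HTTP client would submit the query; the decoding step is included only to show that the query text survives the round trip unchanged.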

Author information

Corresponding author

Correspondence to Selma Khouri.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Khouri, S., Lanasri, D., Saidoune, R., Boudoukha, K., Bellatreche, L. (2019). LogLInc: LoG Queries of Linked Open Data Investigator for Cube Design. In: Hartmann, S., Küng, J., Chakravarthy, S., Anderst-Kotsis, G., Tjoa, A., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2019. Lecture Notes in Computer Science, vol 11706. Springer, Cham. https://doi.org/10.1007/978-3-030-27615-7_27

  • DOI: https://doi.org/10.1007/978-3-030-27615-7_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-27614-0

  • Online ISBN: 978-3-030-27615-7

  • eBook Packages: Computer Science, Computer Science (R0)
