Distributed Storage of Large Knowledge Graphs with Mobility Data

Chapter in: Big Data Analytics for Time-Critical Mobility Forecasting

Abstract

This chapter presents novel solutions for storing and querying large knowledge graphs of mobility data represented in RDF. Such knowledge graphs are generated and updated daily from incoming positional information of moving entities, possibly linked with contextual information and weather data. Coping with their massive size raises several challenges in distributed storage and parallel query processing. The chapter presents the design and implementation of a parallel processing engine for spatiotemporal RDF data built on top of Apache Spark. The engine comprises two layers. The storage layer stores deliberately encoded spatiotemporal RDF triples together with a dictionary of mappings between integer identifiers and RDF resources, and it uses property tables and a columnar storage layout for improved performance. The processing layer consists of a query parsing component, a logical query builder, and a physical query constructor, which together produce execution plans that efficiently handle spatiotemporal constraints along with SPARQL processing. The performance of the engine is demonstrated through experiments over large knowledge graphs of real-life mobility data.
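
The storage ideas named in the abstract (a dictionary between integer identifiers and RDF resources, and property tables) can be illustrated with a minimal single-machine sketch. It is not the chapter's implementation: the class and function names, the `ex:` resources, and the sequential identifier scheme are all assumptions made for illustration, and the sketch ignores both Spark and any spatiotemporal encoding of the identifiers.

```python
from collections import defaultdict


class Dictionary:
    """Hypothetical bidirectional mapping between RDF terms and integer ids."""

    def __init__(self):
        self._to_id = {}     # term -> integer id
        self._to_term = []   # integer id -> term (the list index is the id)

    def encode(self, term):
        # Assign the next free integer the first time a term is seen.
        if term not in self._to_id:
            self._to_id[term] = len(self._to_term)
            self._to_term.append(term)
        return self._to_id[term]

    def decode(self, ident):
        return self._to_term[ident]


def encode_triples(triples, dictionary):
    """Replace every subject, predicate, and object by its integer id."""
    return [tuple(dictionary.encode(term) for term in triple) for triple in triples]


def build_property_table(encoded_triples):
    """Group encoded triples by subject: one row per subject, one column per predicate.

    Assumes single-valued predicates; a later value would overwrite an earlier one.
    """
    table = defaultdict(dict)
    for s, p, o in encoded_triples:
        table[s][p] = o
    return dict(table)


# Illustrative triples only; these resources are not taken from the chapter.
triples = [
    ("ex:node1", "ex:ofMovingObject", "ex:vessel42"),
    ("ex:node1", "ex:hasSpeed", '"12.5"'),
    ("ex:node2", "ex:ofMovingObject", "ex:vessel42"),
]

dictionary = Dictionary()
encoded = encode_triples(triples, dictionary)
table = build_property_table(encoded)
```

Storing integers instead of full IRIs keeps the triples compact, and the property table places all properties of a subject in one row, so a query touching several properties of the same subject needs no self-join in this toy layout.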



Author information

Corresponding author

Correspondence to Christos Doulkeridis.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Nikitopoulos, P., Koutroumanis, N., Vlachou, A., Doulkeridis, C., Vouros, G.A. (2020). Distributed Storage of Large Knowledge Graphs with Mobility Data. In: Vouros, G., et al. Big Data Analytics for Time-Critical Mobility Forecasting. Springer, Cham. https://doi.org/10.1007/978-3-030-45164-6_7

  • DOI: https://doi.org/10.1007/978-3-030-45164-6_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-45163-9

  • Online ISBN: 978-3-030-45164-6

  • eBook Packages: Computer Science, Computer Science (R0)
