Abstract
In disaster management, geospatial information plays a crucial role in the decisions taken to protect and save the population. Gathering as much information as possible from different sources to obtain an overview of the current situation is a complex task because data formats and structures are so diverse. Several approaches have been designed to integrate data from different sources into an ontology, but most of them require background knowledge of the data. However, the non-standard data set schemas (NSDS) of relational geospatial data retrieved from, e.g., web feature services are not always documented. This lack of background knowledge is a major challenge for automatic semantic data integration. To address this problem, this article presents an automatic approach for integrating geospatial data with NSDS. The approach performs a schema mapping based on the result of an ontology matching that corresponds to a semantic interpretation process. This process relies on geocoding and natural language processing. The article extends a previous publication with an improved unit detection algorithm, data quality and provenance enrichments, and the detection of feature clusters. It also presents an improved evaluation process to better assess the performance of the approach against a manually created ontology. The experiments show that the automatic approach yields a semantic interpretation error of around 10% relative to a manual approach.






Notes
https://offenedaten-koeln.de Open data portal of Cologne, from which we retrieved data that we converted to shapefiles.
http://geoportal.saarland.de/arcgis/services/Internet/Gesundheit/MapServer/WFSServer Web service for retrieving data from Saarland, which we converted to shapefiles.
A non-expert is someone who is familiar with Semantic Web technologies but does not know the context or the goal of the data set.
Acknowledgements
This work is funded by the German Federal Ministry of Education and Research (https://www.bmbf.de/en/index.html, project reference: 03FH032IX4).
Cite this article
Prudhomme, C., Homburg, T., Ponciano, JJ. et al. Interpretation and automatic integration of geospatial data into the Semantic Web. Computing 102, 365–391 (2020). https://doi.org/10.1007/s00607-019-00701-y
DOI: https://doi.org/10.1007/s00607-019-00701-y