ABSTRACT
Real-life spatial databases are inherently incomplete. This is particularly the case when data from different sources are combined. Extreme examples are volunteered geographical information systems such as OpenStreetMap.
When querying such databases, the question arises of how reliable the retrieved answers are. For instance, for positive queries, which ask for existing patterns of objects, further answers could show up if the data were completed. For queries with negation, it is furthermore possible that, after data completion, objects cease to satisfy a query.
On the OpenStreetMap wiki, contributors have started to record, for some areas, which object types have been mapped completely. Given a query, we show how such metainformation can be used to classify objects in the database into three groups: certain answers, which are certainly answers in reality; impossible answers, which in reality are definitely not answers; and possible answers, for which it is not known whether they are answers in reality. In addition, we compute the completeness area of a query, that is, the maximal area for which it is certain that no further answer objects exist in reality.
All this additional information can be computed with standard operations on spatial data. Experiments suggest that the computation of such completeness information is feasible.
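As a rough illustration of the three-way classification described above, the following Python sketch labels objects for one example query, "schools that have a bus stop within distance d", and for its negation. It is not the algorithm of the paper. All names (`Obj`, `CompleteRegion`, `classify`) and the sample data are hypothetical, completeness statements are simplified to axis-aligned rectangles, and stored objects are assumed to be correct, so that only missing objects can change an answer. The completeness area of the query is not computed here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Obj:
    """A point object in the (possibly incomplete) database."""
    name: str
    kind: str
    x: float
    y: float


@dataclass(frozen=True)
class CompleteRegion:
    """Hypothetical completeness statement: every real-world object of
    type `kind` inside this rectangle is present in the database."""
    kind: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def covers_disc(self, x: float, y: float, r: float) -> bool:
        # A disc lies inside an axis-aligned rectangle exactly when its
        # bounding box does.
        return (self.xmin <= x - r and x + r <= self.xmax
                and self.ymin <= y - r and y + r <= self.ymax)


def classify(db, statements, d, negated=False):
    """Label each school for 'has a bus stop within d' (or its negation).

    Checking each statement separately is conservative: a disc covered
    only by the union of several rectangles is reported as 'possible'.
    """
    stops = [o for o in db if o.kind == "bus_stop"]
    labels = {}
    for s in (o for o in db if o.kind == "school"):
        has_stop = any(math.hypot(s.x - b.x, s.y - b.y) <= d for b in stops)
        stops_complete = any(
            st.kind == "bus_stop" and st.covers_disc(s.x, s.y, d)
            for st in statements
        )
        if has_stop:
            # A stored stop stays nearby however the data is completed.
            labels[s.name] = "impossible" if negated else "certain"
        elif stops_complete:
            # No stop is stored and none can be missing near this school.
            labels[s.name] = "certain" if negated else "impossible"
        else:
            # A missing stop could still turn up near this school.
            labels[s.name] = "possible"
    return labels


db = [
    Obj("A", "school", 1.0, 1.0),
    Obj("B", "school", 5.0, 5.0),
    Obj("C", "school", 9.0, 9.0),
    Obj("s1", "bus_stop", 1.5, 1.0),
]
statements = [CompleteRegion("bus_stop", 3.0, 3.0, 7.0, 7.0)]
labels = classify(db, statements, d=1.0)
```

In this toy data, school A has a stored bus stop nearby, school B lies in a region where bus stops are recorded as complete, and school C lies outside any such region. The two containment and distance tests used here are the kind of standard spatial operations the abstract refers to; a real system would evaluate them over polygons in a spatial database rather than over rectangles in memory.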
Index Terms
- Adding completeness information to query answers over spatial databases