International Journal of Applied Earth Observation and Geoinformation
Spatio-temporal evaluation matrices for geospatial data
Introduction
The quality of geospatial data is a measure of the difference between the data and the real world that they represent. The greater this difference, the poorer the quality of the data and the smaller their true value. It is not possible to create a perfect representation of the infinite complexity of the real world (Goodchild, 2006). Spatial data are inherently uncertain (Couclelis, 2003), and different people are likely to model the same reality differently (Bédard, 1987).
Descriptions of data quality involve many dimensions of data at different levels of detail, as well as many different components, known in the standards as quality elements. Setting and implementing international quality standards matters in many respects, especially for improving quality, safety, security, environmental protection, and the economical use of natural resources. The standardized descriptions of the eight basic quality management principles as they appear in the ISO 9000 (ISO, 2005a) and ISO 9004 (ISO, 2000) standards are:
Principle 1: Customer focus.
Principle 2: Leadership.
Principle 3: Involvement of people.
Principle 4: Process approach.
Principle 5: System approach to management.
Principle 6: Continual improvement.
Principle 7: Factual approach to decision making.
Principle 8: Mutually beneficial supplier relationships.
These principles provide conceptual guidelines for any business. Furthermore, the International Organization for Standardization (ISO) with its “Technical Committee TC 211–Geographic information/Geomatics” has developed the ISO 19100 series of domain-oriented standards in the last decade, which are specifically dedicated to geographic information and its quality (ISO, 2002; ISO, 2003a; ISO, 2003b; ISO, 2005a; ISO, 2005b; ISO, 2005c; ISO, 2005d; ISO, 2006; ISO, 2007a; ISO, 2007b; ISO, 2007c; ISO, 2008; ISO, 2009a; ISO, 2009b). These standards address the basic issue of how well the geographic world is represented by the data. Their basic goal is to enable interaction of geospatial datasets between different data models and different applications. As the number of different data models and different levels of quality of geospatial datasets grows, it becomes increasingly important that users know where and how specific datasets can be used in their applications. Another important aspect of these standards is the efficient and effective global dissemination of geospatial data and technologies as well as good practices, which contributes to global economic and social progress.
Among these standards, ISO 19113 (ISO, 2005b), ISO 19114 (ISO, 2005c), and ISO 19115 (ISO, 2005d) establish principles for describing the quality of geographic data in the quality-evaluation process and summarize quality into the following well-known elements:
Positional accuracy.
Temporal accuracy.
Semantic, thematic, or attribute accuracy.
Logical consistency.
Completeness.
Knowledge about spatial data quality is contained in metadata, which are communicated to users through transmission of datasets by data producers.
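As a minimal sketch of how the five quality elements listed above could travel with a dataset as metadata, the following Python fragment defines a small record type and one common positional-accuracy measure, the root-mean-square error (RMSE) against reference points. The class name `QualityReport`, its field names, and the chosen units (metres, days, percent) are assumptions made for illustration; they are not taken from the ISO standards or from the authors' method.

```python
from dataclasses import dataclass, asdict
from math import hypot, sqrt
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) in a projected coordinate system, assumed metres


def positional_rmse(measured: Sequence[Point], reference: Sequence[Point]) -> float:
    """Root-mean-square horizontal distance between dataset and reference points."""
    if len(measured) != len(reference) or not measured:
        raise ValueError("need two non-empty point lists of equal length")
    squared_errors = [
        hypot(mx - rx, my - ry) ** 2
        for (mx, my), (rx, ry) in zip(measured, reference)
    ]
    return sqrt(sum(squared_errors) / len(squared_errors))


@dataclass
class QualityReport:
    """Hypothetical metadata record with one slot per quality element."""
    positional_accuracy_m: Optional[float] = None    # e.g. RMSE in metres
    temporal_accuracy_days: Optional[float] = None   # e.g. age of the data
    thematic_accuracy_pct: Optional[float] = None    # share of correct attributes
    logical_consistency_pct: Optional[float] = None  # share of rule-conformant items
    completeness_pct: Optional[float] = None         # share of expected items present

    def missing_elements(self) -> List[str]:
        """Names of quality elements the producer has not reported."""
        return [name for name, value in asdict(self).items() if value is None]


# Illustrative values only: two check points, one of them displaced by 5 m.
measured = [(0.0, 0.0), (10.0, 10.0)]
reference = [(3.0, 4.0), (10.0, 10.0)]
report = QualityReport(
    positional_accuracy_m=positional_rmse(measured, reference),
    completeness_pct=98.0,
)
print(report)
print("Not reported:", report.missing_elements())
```

Leaving unreported elements as `None`, rather than defaulting them to zero, keeps "not assessed" distinct from "assessed and poor", which is the kind of distinction a user needs when reading quality metadata.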
Daily practice in the geospatial community shows numerous and constant efforts among geospatial data producers and providers to improve the quality of geospatial data, and also to make it possible for users of geospatial data to understand the effects of data quality on their work. Although geospatial data producers usually know the data they produce and their quality in detail, geospatial data users, with the exception of users within professional geospatial communities (e.g., GEOSS), mostly find information on quality too complicated and therefore either do not understand these quality details or do not even care about them. We all need to improve the understanding of geospatial data quality and its effects.
Several methodologies for turning user requirements into quality requirements have been developed and applied in recent decades, such as the approaches of Total Quality Management (TQM) and Workflow Management (WFM; Radwan et al., 2001). These include Quality Function Deployment (QFD), a method of Decision Support Systems (DSS) to transform user demands into design quality (Akao, 1994). QFD helps transform customer needs, using a market research technique termed Voice of the Customer (VOC) in business and information technology, into engineering characteristics and appropriate test methods for a product or service, as well as exploring how to turn user input into innovation (Ulwick, 2002). Extensive discussions of these approaches can be found in the literature (e.g., Burstein and Holsapple, 2008). In this regard, Geographic Information Systems (GIS) are also described in the literature as a tool of Spatial DSS (Pick, 2005).
The next section of this paper demonstrates that specialists in geospatial data quality have good theories and techniques at their disposal. Unfortunately, these theories and techniques are not accessible to all and are still far from being sufficiently understood, especially in the user domain. The current situation in this regard is not what participants in the global geospatial data community need and expect in order to achieve quality results when using geospatial data. The size and effect of this problem will grow with new developments in geospatial data production principles, where the borders between geospatial data producers and data users are becoming blurred.
Various kinds of web-based user input of geospatial data, enabled by Web 2.0 technology (O’Reilly, 2005), are changing the processes of geospatial data production and use, with both good and bad consequences. Participants in the geospatial data production process are becoming both producers and users, which recently gave rise to the neologism produsers (Bruns, 2008). The terms neogeography (Turner, 2007), VGI or volunteered geographic information (Goodchild, 2007), crowdsourcing and urban sensing (McLaren, 2009), and cloud computing with its cloud service models and cloud deployment models (Mell and Grance, 2009) were coined recently and are rapidly gaining importance in the global geospatial community. These concepts of collaborative data development fill much of the agenda at recent international geospatial events, and it is becoming obvious that they have very extensive effects on development practices and will revolutionize how participants in the geospatial community use and manage data. In the second half of 2009, we are witnessing the first of the established commercial producers (e.g., deCarta) of geospatial tools and services announcing support for crowd-sourced geospatial data across their product line. This is introducing new developments for companies in the geospatial tools business, for geospatial data producers, and for end users of geospatial data.
Users increasingly treat the web as an interactive environment in which the accumulation of information from individuals is as important as the distribution of information to individuals, as can be seen in various available solutions and examples such as Tele Atlas MultiNet (www.teleatlas.com), TomTom Map Share (www.tomtom.com/page/mapshare), Navteq Map Reporter (mapreporter.navteq.com), Nokia Maps (maps.nokia.com), Google Maps (maps.google.com), Bing Maps (www.bing.com/maps), Wikimapia (wikimapia.org), and OpenStreetMap (www.openstreetmap.org).
Therefore, every effort to improve the systematic communication of data-quality information between data producers and data users is welcome. The authors’ contribution to spatio-temporal data-quality assessment is the concept of a simple and dynamic solution that applies the STEM (Spatio-Temporal Evaluation Matrix) and INSTANT (INdex of Spatio-Temporal ANTicipations) matrices to spatio-temporal data, explained in detail in Section 3.
Section snippets
Methodology and developments in spatio-temporal data quality
Because they deal with models of reality, spatial information processes covering such essential areas as data acquisition, geoinformation theory, spatio-temporal statistics, and dissemination are often imprecise, allowing for much interpretation of abstract figures and data. Therefore, quality aspects in spatio-temporal data mining are very important in providing practical and theoretical solutions for making sense of the often chaotic and overwhelming amount of concrete geospatial data
Matrix representation and communication of spatio-temporal data quality
Establishing principles for describing the quality of geographic data, specifying components for reporting quality, and organizing information about data quality are the essential purpose of the ISO 19113 standard (ISO, 2005b). Through the concept of data quality in the ISO 19113 standard shown in Fig. 1, data quality is the difference between the views of the real or hypothetical world (termed the universe of discourse and defined by a product specification) and a dataset. Geospatial data
Discussion
The authors are well aware that the concept of STEM/INSTANT matrices presented here is only a first step towards a simple, easily understandable, and systematic evaluation of geospatial data quality. The concept is meant as a numerically based visual complement to the established methods and procedures for data-quality assessment, with the basic intention of bringing the subject of spatio-temporal data quality closer to users’ comprehension. Taking into account that the current global trends in
Conclusions
One of the main points of this paper is to show how, in a shared and organized manner, producers and users can acquire a simple solution to defining available and necessary levels of spatio-temporal data quality. The goal is to bring geospatial data producers and users together, to enable them to jointly build the common universe of discourse and mutually interact within it. In the era of the progressive environment of Web 2.0 technology, average users of geospatial data and the web are capable
References (75)
Development history of quality function deployment
The Dumbest Generation—How the Digital Age Stupefies Young Americans and Jeopardizes Our Future (Or, Don’t Trust Anyone Under 30)
(2008)
Uncertainties in land information systems databases
Towards multidimensional user manuals for geospatial datasets: legal issues and their considerations into the design of a technological solution
Towards a formal framework for spatio-temporal granularities
What communicates quality to the spatial data consumer?
The future is user-led: the path towards widespread produsage
FiberCulture Journal
(2008)
The certainty of uncertainty: GIS and the limits of geographic knowledge
Transactions in GIS
(2003)
Issues on modeling spatial granularities
Building a multi-granularity based spatial database
Next-generation Digital Earth: A position paper from the Vespucci initiative for the advancement of geographic information science
International Journal of Spatial Data Infrastructures Research
Quality management, data quality and users
Communication and use of spatial data quality information in GIS
Fundamentals of Spatial Data Quality
Multidimensional management of geospatial data quality information for its dynamic use within geographical information systems
Photogrammetric Engineering & Remote Sensing (PE&RS)
Towards spatial data quality information analysis tools for experts assessing the fitness for use of spatial data
International Journal of Geographical Information Science
Data Warehousing Special Report: Data quality and the bottom line
Future and Emerging Technologies Programme
Seventh Framework Programme
Building a geospatial data framework—finding the best available data
Procedure to select the best dataset for a task
GEO 2009–2011 Work Plan
Strategic Guidance for Current and Potential Contributors to GEOSS
On the importance of external data quality in civil law
Auditing spatial data suitability for specific applications: professional and technological issues
Foreword
Citizens as voluntary sensors: spatial data infrastructure in the world of Web 2.0
International Journal of Spatial Data Infrastructures Research
How to select the best dataset for a task?
Quality needs more than standards
An ontology driven approach for spatial data quality evaluation
Testing the effects of positional uncertainty on spatial decision making
International Journal of Geographical Information Science
Testing the effects of thematic uncertainty on spatial decision making
Cartography and Geographic Information Science
Improving the usability of spatial information products and services
The European Information Society: Leading the Way in Geoinformation, Lecture Notes in Geoinformation and Cartography
INSPIRE Directive—Definition of Annex Themes and Scope (D 2.3 Version 3)