A protocol for machine-readable cache policies in OGC web services: Application to the EuroGeoSource information system
Introduction
Geospatial services are widely used in environmental modeling software, as they provide a flexible, reusable alternative to monolithic and closed applications, and provide support for next-generation integrated environmental modeling systems (Granell et al., 2013, Laniak et al., 2013). Services following ISO/OGC (International Organization for Standardization/Open Geospatial Consortium) standards (we will refer to them as OGC web services) are being used to implement different computational models (e.g. alpine runoff events (Granell et al., 2010)), to share environmental data (e.g. historical landslides and floods (Salvati et al., 2009)), in environmental e-government applications (Latre et al., 2013), and also in combination with other services (Gebhardt et al., 2010, Peckham and Goodall, 2013).
Many types of OGC web services, such as Web Map Services (WMS), Web Map Tile Services (WMTS), Web Feature Services (WFS), Web Coverage Services (WCS) and Catalog Services for the Web (CSW), are “geographic model/information management services” according to the ISO 19119:2005 geographic services taxonomy (ISO, 2005). Caching the contents of services of this kind is a good strategy for accessing them with good performance or for improving their availability. The most common practices are CSW metadata harvesting (Li et al., 2011, Deng and Wu, 2010) and WMS tiling (Liu and Nie, 2010, García et al., 2013). Tiling can be used for raster data (Lia et al., 2011), so it can be adapted to WCS. Caching is also starting to be applied to WFS vector data (Pla and Lleopart, 2010).
Caching is a technique often used in environmental software when performance is required, especially in distributed web-based systems and, more recently, in cloud-based systems. For example, in a virtual database for ecological data, Frehner and Brändli (2006) use caching in their integration layer, on top of several WFSs. In another web services-based system, in this case for hydrologic data, Ames et al. (2012) use caches to keep local copies of observational data series.
In the context of geospatial services, a cache usually means temporary storage for some of the contents offered by these services. In general terms, a cache improves the performance perceived by the users of a service because it allows the results of certain time-consuming operations offered by the service to be pre-generated and reused. The provider of the service may choose to establish a cache, but the users of the service may also choose to do so on their own side if they need to, i.e. on their own desktop computer or on a local server. The latter is the most relevant case for this paper.
Caching a service can place a heavy load on it. For instance, tiling a WMS means making many thousands, even millions, of requests, which are added to those made by its regular users. Besides this, the cached contents, e.g. the map tiles, are stored for an undefined amount of time beyond the control of the service rights-holder. These are good reasons for establishing and expressing certain conditions for caching a service's contents. For instance, the Spanish Cadastre WMS can be used freely, but its capabilities include a prohibition on making tiled requests and massive downloads. The UK Ordnance Survey Open Space developer agreement grants permission for the automatic, immediate and temporary storing (caching) of data. In the French Géoportail, caching is prohibited unless an explicit license is obtained.
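To give a sense of the request volumes involved, the following minimal sketch counts the tiles in a global quadtree tile pyramid, where each zoom level has four times as many tiles as the previous one and each tile costs one request. The quadtree scheme and the function names are assumptions made for illustration; no particular tiling scheme is prescribed here.

```python
def tiles_at_level(zoom: int) -> int:
    """Tiles at one zoom level of a quadtree pyramid (2^zoom by 2^zoom)."""
    return 4 ** zoom


def tiles_up_to_level(max_zoom: int) -> int:
    """Total tiles for zoom levels 0..max_zoom, i.e. one request per tile."""
    return sum(tiles_at_level(zoom) for zoom in range(max_zoom + 1))


if __name__ == "__main__":
    # Assumed example depths; real caches usually cover a bounded region.
    for max_zoom in (5, 10, 12):
        print(f"levels 0..{max_zoom}: {tiles_up_to_level(max_zoom)} requests")
```

Under these assumptions, caching levels 0 to 5 takes 1365 requests, while levels 0 to 10 already take 1398101, which is why a full tiling run can dominate the load of a service.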
As the contents provided through OGC web services change, those contents need to be cached periodically to keep the cache up to date. Since the conditions for caching those services can change too, it would be useful to express them in a machine-readable way, so that a cooperating, automatic cache updater could react to those changes. However, this is not the current situation, as the natural-language licenses in the examples above show.
This paper proposes a protocol to specify cache policies for OGC web services in a machine-readable Rights Expression Language (REL) that can be followed by cooperative harvesters. We use the term harvesters for automatic processes that cache the contents of geographic model/information management services. We need those harvesters to be “cooperative” because the protocol is not a full rights management system, so it does not enforce the cache policies. The protocol is applicable to a situation that is common today, and can also be a first step towards a Digital Rights Management (DRM) framework for those interested.
A preliminary version of this protocol has been tested in the EuroGeoSource project (see Section 4), where a number of Web Feature Services (WFS) providing data on mineral deposits and energy resources are periodically harvested and cached in a central node to improve the efficiency and availability of several applications. The data provided through these services can be used as input to environmental models like those proposed by Côte et al. (2010) or González et al. (2011).
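The idea of a cooperative harvester can be illustrated with a minimal sketch. The `CachePolicy` class below is a hypothetical, heavily simplified stand-in for a parsed machine-readable policy; its fields (`caching_allowed`, `max_age`) are assumptions for illustration and are not the vocabulary of the protocol or of any REL. The point is only that the harvester checks the policy voluntarily, since nothing enforces it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class CachePolicy:
    """Hypothetical simplified policy; not the protocol's actual model."""
    caching_allowed: bool
    # Assumed field: how long cached content may be kept (None = no limit stated).
    max_age: Optional[timedelta] = None


def may_harvest(policy: CachePolicy) -> bool:
    """A cooperative harvester only starts a harvest if caching is allowed."""
    return policy.caching_allowed


def must_discard(policy: CachePolicy, cached_at: datetime, now: datetime) -> bool:
    """Decide whether already cached content has to be dropped."""
    if not policy.caching_allowed:
        return True  # the policy changed: caching is no longer permitted
    if policy.max_age is None:
        return False
    return now - cached_at > policy.max_age
```

A harvester built along these lines would re-read the policy before every update cycle, so that a change made by the service owner takes effect at the next harvest.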
The rest of this paper is organized as follows. The next section reviews work related to RELs and licensing in the geospatial web. Section 3 details a protocol for OGC web services cache policies, based on ODRL 2.0. This protocol also establishes how to embed these policies in OGC web services (Section 3.3), and proposes an algorithm for cooperating harvesters to follow the policies (Section 3.4). Section 4 describes the EuroGeoSource project and how a first version of the protocol proposed in this paper was implemented there. Section 5 discusses the rationale behind some of the most significant decisions taken. To finish this paper, Section 6 summarizes the main conclusions and proposes some future lines of work.
Section snippets
Related work
Explicit license terms are necessary for geospatial assets (e.g. data and services) if the rights and obligations of their users must be clear. For instance, Spatial Data Infrastructures (SDIs) deal with this issue by defining more or less formal “Access Policies” (Béjar et al., 2012, p. 267) for their shared assets. Even open data and content are not really open unless their license terms, or the legal conditions under which they are available, are well-known.
In
A protocol for OGC web service cache policies
OGC web services allow for expressing some information about use “fees” and “access constraints” in their capabilities. However, neither the meaning nor the syntax of these elements is defined, as they are free text fields. This section describes a protocol to declare ISO/OGC service policies in a machine-readable format. This protocol is designed to regulate the behavior of harvesters which access data and metadata-providing OGC web services in order to download and cache those data and
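As a minimal sketch of what a client can obtain today, the following code reads the free-text `Fees` and `AccessConstraints` elements from a WMS 1.3.0 capabilities document using Python's standard library. The sample document and its wording are invented for illustration. The code can extract the text, but it cannot interpret it, which is the limitation that motivates a machine-readable policy.

```python
import xml.etree.ElementTree as ET

# Namespace of WMS 1.3.0 capabilities documents.
WMS_NS = {"wms": "http://www.opengis.net/wms"}

# Invented sample; a real harvester would fetch this via GetCapabilities.
SAMPLE_CAPABILITIES = """
<WMS_Capabilities xmlns="http://www.opengis.net/wms" version="1.3.0">
  <Service>
    <Name>WMS</Name>
    <Title>Example map service</Title>
    <Fees>none</Fees>
    <AccessConstraints>Tiled requests are not allowed</AccessConstraints>
  </Service>
</WMS_Capabilities>
"""


def read_use_conditions(capabilities_xml: str) -> dict:
    """Return the free-text fees and access constraints, or None if absent."""
    root = ET.fromstring(capabilities_xml.strip())
    return {
        "fees": root.findtext("wms:Service/wms:Fees", namespaces=WMS_NS),
        "access_constraints": root.findtext(
            "wms:Service/wms:AccessConstraints", namespaces=WMS_NS
        ),
    }


if __name__ == "__main__":
    print(read_use_conditions(SAMPLE_CAPABILITIES))
```

Both values come back as plain strings, so a harvester has no reliable way to decide from them whether tiling or bulk download is permitted.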
Application to the EuroGeoSource information system
The EuroGeoSource project, co-funded by the Competitiveness and Innovation Framework Programme (CIP), under the Policy Support Programme (PSP), Geographic Information Theme, of the European Union, has developed an Internet pilot information system which provides access to geographical information on geo-energy and mineral resources in ten European countries.
The system has been developed following an SDI architecture based on INSPIRE
Rationale and discussion
A number of decisions were taken during this work, and this section discusses the rationale behind them. These decisions have been guided by our interest in providing a novel solution to a real problem, one that is both easy to adopt and extensible if needed.
An acceptable balance between simplicity (both for the service owners and for the service harvesters) and functionality has been difficult to achieve. The first issue was the granularity of the assets. From the whole service, to
Conclusions and future work
This paper proposes a protocol to specify cache policies for OGC web services in ODRL 2.0 that can be followed by cooperative harvesters. We have used ODRL 2.0 because of our specific interest in using a REL that is not tied to any particular DRM system. This separation is a recommended practice even when the final objective is the adoption of a DRM framework (Jamkhedkar et al., 2006) or the use of access control technologies (The UK Location Architecture Interoperability Board – Business
Acknowledgments
This work has been partially funded through the EuroGeoSource project (project number 250532), from the European Union's ICT Policy Support Programme as part of the Competitiveness and Innovation Framework Programme, by the Spanish Government (project TIN2012-37826-C02-01), by the Government of Aragon (project INNOVA-A1-038-13) and by the National Geographic Institute (IGN) of Spain. We also want to thank the members of the EuroGeoSource consortium for their help, suggestions and hard work, and the
References (43)
- et al. (2012). HydroDesktop: web services-based software for hydrologic data discovery, download, visualization, and analysis. Environ. Model. Softw.
- et al. (2012). A comparative evaluation of technical solutions for long-term data repositories in integrative biodiversity research. Ecol. Inform.
- et al. (2012). An RM-ODP enterprise view for spatial data infrastructures. Comput. Stand. Interfaces
- et al. (2010). Systems modelling for effective mine water management. Environ. Model. Softw.
- et al. (2006). Virtual database: spatial analysis in a web-based data management system for distributed ecological data. Environ. Model. Softw.
- et al. (2013). A neural network based intelligent system for tile prefetching in web map services. Expert Syst. Appl.
- et al. (2010). Improving data management and dissemination in web based information systems by semantic enrichment of descriptive data aspects. Comput. Geosci.
- et al. (2011). Impact of unconfined sulphur-mine waste on a semi-arid environment (Almería, SE Spain). J. Environ. Manag.
- et al. (2010). Service-oriented applications for environmental models: reusable geospatial services. Environ. Model. Softw.
- et al. (2013). Enhancing integrated environmental modelling by designing resource-oriented interfaces. Environ. Model. Softw.
- Spatial data infrastructures for environmental e-government services: the case of water abstractions authorisations. Environ. Model. Softw.
- Integrated environmental modeling: a vision and roadmap for the future. Environ. Model. Softw.
- Diverse or uniform? – intercomparison of two major German project databases for interdisciplinary collaborative functional biodiversity research. Ecol. Inform.
- Long term ecological research and information management. Ecol. Inform.
- Driving plug-and-play models with data from web services: a demonstration of interoperability between CSDMS and cuahsi-his. Comput. Geosci.
- On the optimal level of protection in DRM. Inf. Econ. Policy
- Digital rights expression languages (DRELs). JISC Technol. Stand. Watch
- “GeoBusinessLicence” – one licence for all
- WP8-system Development and Implementation. Public Deliverable D8.1, EuroGeoSource Project Consortium
- GeoDRM: towards digital management of intellectual property rights for spatial data infrastructures
- Research on the harvest and cascade of catalogue service in GeoGlobe service platform