1 Introduction

The strategic importance of the agricultural sector is evident around the world. The different groups of stakeholders involved in agricultural activities have to manage many heterogeneous sources of information that need to be combined in order to make economically and environmentally sound decisions. These decisions include, among others, the definition of policies (with respect to subsidies, standardisation and regulation, national strategies for rural development, and climate change), the development of sustainable agriculture, crop harvest timing and pricing, and pest detection.

In the agriculture domain, the past 20 years may be characterized by a shift from conventional farming to precision farming. This shift is driven by increased demand for higher yield on the one hand and, on the other, by increased concern about the loss of biodiversity, which is quite significant in agricultural regions [1, 2] and might also be one of the driving forces behind the use of precision farming. Precision farming is in general considered a form of intensive agriculture [3]. However, because it enables the application of exactly the necessary amount of fertilizers or pesticides in explicitly spatially delineated plots, it could be considered an additional tool in so-called ecological intensification [4] or conservation agriculture [5].

Precision farming relies on geospatial data and functionality that are available through a Farm Management Information System (FMIS). Any FMIS may also be considered as a kind of Geographic Information System (GIS) as it provides answers to two basic questions: what has happened and where. FMIS effectiveness from the geospatial point of view is limited chiefly by a lack of accurate data.

This is a paradoxical situation, since there are proprietary FMISs on the one hand and requests for open data and services to power proprietary FMISs on the other. A way out may be found in the form of an Open Farm Management Information System (OpenFMIS). We would like to emphasize that an OpenFMIS uses both open and proprietary data. Open data are used as inputs into an OpenFMIS, geospatial data being the best example: they range from satellite and aerial images through cadastre data to zones where the usage of nitrogen is forbidden. Proprietary data typically comprise sensitive economic information like the turnover of a farm, the consumption of fertilizers, pesticides and fuels, the wages of employees etc. Proprietary data remain in the system, no matter whether it is an FMIS or an OpenFMIS. The greatest difference between an FMIS and an OpenFMIS lies in the presence of open application programming interfaces that (mostly) allow the (re)use of data and/or functionality offered by other open applications. An OpenFMIS is also capable of exporting non-protected data, such as a better geometrical representation of a field. An OpenFMIS also offers modularity, i.e. any component that conforms to geospatial/information technology standards may be added or taken away.

Moreover, world-wide concepts like the Digital Earth and Global Earth Observation System of Systems (GEOSS) are the ideal candidates for the integration of such agricultural pollution data from around the world.

The European project “Farm-Oriented Open Data in Europe” (FOODIE), funded from 2014 to 2017, addresses the above-mentioned issues. The FOODIE project aims at building an open and interoperable specialized agricultural platform hub on the cloud for: the management of spatial and non-spatial data relevant for farming production; the discovery of spatial and non-spatial agriculture-related data from heterogeneous sources; the integration of existing and valuable European open datasets related to agriculture; and the publication and linking of external agriculture data sources contributed by different public and private stakeholders. This allows for the provision of specific and high-value applications and services to support the planning and decision-making processes of different stakeholder groups related to the agricultural and environmental domains.

2 Functionalities Provided by the Platform

The FOODIE approach supports the better adoption of Information and Communication Technologies (ICT) and also enables better collaboration between different stakeholders across the agri-food chain. It supports trusted software models as services for the farming sector. This will help farmers employ new types of solutions without the otherwise necessary investment and also allow the easy integration of existing platforms through interoperable interfaces. This approach also incorporates the support of mobility platforms (the accessing of information and knowledge everywhere, and also the integration of different devices into all systems) and guarantees the accessibility of different types of information.

The FOODIE platform aims at being the “glue” between the public and private sectors by acting as a central ICT system (e.g., similar to a Land Parcel Identification System) combined with an intelligent underlying hardware and software infrastructure that allows the storage of farmers’ data (unique land parcel ids inside a region, farming activities performed), landowners, land use, orthoimages, satellite data etc. An open data model for (precision) agriculture was defined as the backbone of the FOODIE platform, as described by [6, 7]. Portions of these datasets would remain private (e.g., farmer data), whereas others like orthoimages and satellite data plus derived data from all European Member States and other open datasets could be accessed and used by all other user groups besides the Ministry of a Member State. The EU Directive concerning the re-use of public sector information sets down the legal framework in this respect. This would definitely transform ICT use in agriculture and forestry. As seen in the previous section, the INSPIRE Directive also goes in the same direction; that is, to establish an infrastructure for spatial information in Europe for the purposes of EC environmental policies or activities which might have an impact on the environment (see also [8]).

In order to implement the FOODIE concept and the associated service platform hub, the project aims at accomplishing the following technological objectives:

  • to make use of the existing spatial information resources and services for various domains coming from different initiatives like INSPIRE (INfrastructure for SPatial InfoRmation in Europe; see [9]), Copernicus, GNSS, Galileo, GEOSS (Global Earth Observation System of Systems), GBIF (Global Biodiversity Information Facility), and EUNIS (European Nature Information System), in which the European Commission and the Member States have invested heavily over the past decade;

  • to design and provide an open and interoperable geospatial platform hub on the cloud based on existing software components derived from research results and available solutions in the market (mostly open-source). These include:

  • the integration of external agriculture production and food market data using the principles of Open Linked Data,

  • the creation of an open and flexible lightweight Application Programming Interface (API) that allows private and public stakeholders in the agricultural and environmental area to publish their own datasets (e.g., datasets provided by local sensor networks deployed in situ in farms, and knowledge from farm communities and agricultural services companies, etc.) and make them available via the platform hub as open linked data (enabling them to be subject to further processing),

  • the creation of specific and high-value applications and services to support the planning and decision-making processes of different stakeholder groups,

  • the provision of security mechanisms to prevent the unauthorised access and use of the platform users’ personal information as well as the data published by them,

  • the establishment of a marketplace where data can be discovered and exchanged but where external companies can also publish their own agricultural applications based on the data, services and applications provided by FOODIE.

The FOODIE platform offers two levels of functionality as defined in its business and exploitation model:

  • open data relevant for (precision) farming in the most advanced exchangeable form possible;

  • added value services built on the top of open data to provide advice to the end users (such as farmers, consultant companies, environmentalists, policy makers, researchers, or citizens).

The first level functionality also includes a catalogue inventorying all available datasets and services to discover whether a FOODIE platform contains the desired data/services or not. The discovery concept originates from the INSPIRE domain [10], including the structure as well as the exchange format of metadata. Searching for data/services in the FOODIE catalogue is provided free of charge for all kinds of users. The FOODIE Catalogue respects the standards designed to publish and access digital metadata for geospatial data, services, and related (semantic) resource information as defined by [11,12,13,14,15,16]. The same also applies to the downloading of data, unless the particular owner of the data has imposed some restriction or condition on data sharing. The first level functionality is intended for users who are searching for European spatial data related to agriculture and would like to process data themselves.

Licenses are open for the first level functionality: Creative Commons Attribution Share Alike 4.0 for data components like data models and the Open Data Commons Open Database License (ODbL) for databases. Searching and previewing capabilities are also available free of charge. Licenses are stated explicitly in the description of each product (e.g. component, database), typically in its GitHub repository.

In concert with its five-star rating, the FOODIE platform offers relevant data in one place at the highest possible level of data openness. For example, the Open Transport Network, which also includes unpaved rural roads, is offered in RDF through a SPARQL endpoint, while satellite data from the Copernicus programme are offered in a machine-readable georeferenced (TIFF) format.

Meanwhile, the second level functionality brings added value services derived from open (and sometimes also proprietary) data. For instance, an 8-year series of satellite images combined with other data allows the computation of yield potential, see Fig. 1. The yield potential concept of the FOODIE platform aims at establishing a general model for predicting yield potential zones for almost any kind of crop. Such added value services are being offered on the basis of payment through standardised Web interfaces, e.g. Open Geospatial Consortium’s Web Processing Service.

Fig. 1. Yield potential map for one farm as a reference for ecologically and economically sound decisions

Licenses are commercial for the second level functionality; they are set individually according to the purpose(s) of use, the number of end users, included/excluded maintenance etc. The functionality of added value services is described in the following sub-sections. The FOODIE architecture follows the so-called Service-Oriented Architecture, which enables orchestration between and beyond the FOODIE platform through standardized services. Open interfaces and formats, as defined by the Open Geospatial Consortium [24, 25] and/or the World Wide Web Consortium [26, 27], enable connections to various (external) systems (services) as well as to other layers of the FOODIE platform.

2.1 Yield Potential

Yield potential zones are areas with the same yield level within fields. Yield is the integrator of landscape and climatic variability and, as such, it provides useful information for identifying management zones as defined by [28, 29]. This represents the basic delineation of management zones for site-specific crop management, which is usually based on yield maps for the past few years. The presence of complete series of yield maps for all fields is rare; thus, remotely sensed data are analysed to determine the field variability of crops through vegetation indices.

The yield potential concept of the FOODIE platform aims at establishing a general model for yield potential zones for almost any kind of crop. Such universality comes at a price. The model is capable of expressing significant spatial variations of a given crop yield on a plot by distinguishing three kinds of values: below average, average, and above average. Because the model depicts spatial variations within a plot, it may be misleading when used to compare yield zones between plots. Altogether, we may conclude that some areas in a plot have significantly lower productivity than others and take such information into the decision-making process. We cannot, however, determine whether it is better to produce a crop on plot A or on plot B.

So far, the ESPA (United States Geological Survey’s Earth Resources Observation and Science Center Science Processing Architecture repository) of LANDSAT satellite images [30] is being used as the main data source, which offers surface reflectance products, main vegetation indices (NDVI, EVI; for details see [31]), and cloud identification by the CFmask algorithm. A selection of scenes from the past eight years was made for a particular farm area in order to collect cloud-free data related to the second half of the vegetation period. Yield potential was calculated for separate scenes as the relation of each pixel to the mean value of the whole field. In the last step, all scenes were combined and the median value of yield potential was calculated. After the full operation of Sentinel 2A/B satellites, the calculation of yield potential will be enhanced by these vegetation products.
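The calculation described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the FOODIE implementation: each scene is represented as a flat list of vegetation index values for the pixels of one field, the function names are hypothetical, cloud-masked pixels are not handled, and the tolerance of 5% used to separate below-average, average, and above-average zones is an assumed value, since the text does not state the thresholds.

```python
from statistics import mean, median


def relative_to_field_mean(scene):
    """Relate each pixel of one scene to the mean value of the whole field."""
    field_mean = mean(scene)
    return [value / field_mean for value in scene]


def yield_potential(scenes):
    """Combine all scenes by taking the per-pixel median of the relative values."""
    relative = [relative_to_field_mean(scene) for scene in scenes]
    # zip(*relative) groups the values of the same pixel across all scenes
    return [median(pixel_values) for pixel_values in zip(*relative)]


def classify(potential, tolerance=0.05):
    """Assign a zone label; the tolerance is an assumed, illustrative value."""
    if potential < 1.0 - tolerance:
        return "below average"
    if potential > 1.0 + tolerance:
        return "above average"
    return "average"


# Three hypothetical cloud-free scenes of a four-pixel field (index values)
scenes = [
    [0.40, 0.50, 0.60, 0.50],
    [0.30, 0.50, 0.70, 0.50],
    [0.45, 0.50, 0.55, 0.50],
]
potential = yield_potential(scenes)
zones = [classify(p) for p in potential]
```

Because every scene is normalised by its own field mean before the median is taken, the result is comparable between pixels of one field but not between fields.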

The concept of yield potential was successfully validated. A comparison of yield results with estimations of the yield potential showed that the spatial variation of the yield potential reaches 80% for the whole evaluated farm, Tršická zemědělská in the Czech Republic. The evaluated farm covers about 1,284 hectares. Ongoing research follows two main directions. The first aims at optimising the algorithm to increase the probability of yield prediction up to 90%. The second focuses on the implementation of the yield potential calculation as an open Web service, namely as an OGC Web Processing Service [24].

2.2 Sensor Monitoring

It goes without saying that the most precise data will always be the data gained through observations and measurements performed directly on the farm. This is partly true in the contemporary practices of precision agriculture. Sensor networks for atmospheric and meteorological conditions such as aerial and ground thermometers, anemometers, hygrometers etc. are available on farms [32]. Similar sensor networks for collecting information on soil conditions [33] or ground water levels can also be established directly on farms. However, information on the application of treatments in certain places and at certain times may often be of significantly lower quality (see Sect. 2.3 for details). The best data source in this respect is the tractor and its application machine. Issues related to the manual and aerial application of treatments are, however, beyond the scope of this paper. Sensor data collected on farms by different sensor networks are stored and managed by the SensLog application.

SensLog is a complex component of the FOODIE platform for collecting, storing, processing, analysing and publishing sensor data. Through a system of Web services, SensLog receives observations and measurements from individual gateways deployed in farm fields and passes them to the repositories of the FOODIE platform. The internet connections of deployed gateways can be realized by a number of methods; therefore, SensLog provides for variable usage. SensLog is capable of storing the observations and positions of mobile sensors in the same way as for static sensors. Additional data sets can be provided by storing data in the database in different schemas. SensLog provides processing procedures and analytical functions to pre-process raw collected data or to detect alert events in operating sensor networks. Additional data sources can be involved in these analytical functions (e.g. the geometries of plots from LPIS, hydrography geometries, statistical data, zone pricing etc.). Such additional data sources are stored in the FOODIE platform and/or in external repositories.
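The detection of alert events mentioned above can be illustrated by a simple range check on incoming observations. The sketch below is hypothetical and does not reproduce the SensLog code: the class names, field names, and the temperature limits are assumptions made for the example only.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Observation:
    sensor_id: str
    phenomenon: str
    time: datetime
    value: float


@dataclass
class AlertRule:
    phenomenon: str
    minimum: float
    maximum: float


def detect_alerts(observations, rules):
    """Return (observation, rule) pairs whose value lies outside the rule's range."""
    by_phenomenon = {rule.phenomenon: rule for rule in rules}
    alerts = []
    for obs in observations:
        rule = by_phenomenon.get(obs.phenomenon)
        if rule is not None and not (rule.minimum <= obs.value <= rule.maximum):
            alerts.append((obs, rule))
    return alerts


# Hypothetical rule and measurements: alert on frost or extreme heat
rules = [AlertRule("air_temperature", minimum=0.0, maximum=35.0)]
observations = [
    Observation("thermo-1", "air_temperature", datetime(2016, 4, 2, 5, 0), -2.5),
    Observation("thermo-1", "air_temperature", datetime(2016, 4, 2, 12, 0), 12.0),
    Observation("hygro-1", "air_humidity", datetime(2016, 4, 2, 12, 0), 80.0),
]
alerts = detect_alerts(observations, rules)
```

Observations of phenomena without a rule (here the humidity reading) are ignored rather than treated as alerts.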

The database schema of SensLog follows the ISO 19156 Observations and Measurements concept [34]. Extended functionality was implemented in the database model to fulfil all farmer requirements. Data are stored in a PostgreSQL (open source) database in version 9.3.6 together with its spatial extension PostGIS in version 2.1.0. In addition, the developed database schema has been replicated in the Cloud (Infrastructure as a Service) provided by the Poznań Supercomputing and Networking Center (in Poland) using OpenStack as an Open Source Cloud Computing Software.

A system of Web services in several forms was developed to provide a variety of methods of data publishing. RESTful services publishing data in JSON format are used for operational usage and communication between FOODIE platform components. Services based on OGC Sensor Observation Service 1.0.0 core methods are used for the standardised publishing of data. An overview diagram of the SensLog application is shown in Fig. 2.
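To make the exchange of observations in JSON more concrete, the following sketch serialises a single observation whose fields follow the main concepts of the Observations and Measurements model (procedure, observed property, feature of interest, phenomenon time, result). The JSON layout, the class name, and the identifiers are assumptions chosen for illustration; they are not the actual SensLog message format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class OMObservation:
    procedure: str            # the sensor that produced the value
    observed_property: str    # what was measured
    feature_of_interest: str  # where it was measured, e.g. a plot identifier
    phenomenon_time: str      # ISO 8601 timestamp of the measurement
    result: float
    unit: str


def to_json(observation):
    """Serialise an observation to a JSON string with stable key order."""
    return json.dumps(asdict(observation), sort_keys=True)


def from_json(payload):
    """Rebuild an observation from its JSON representation."""
    return OMObservation(**json.loads(payload))


# Hypothetical soil moisture reading from a gateway on one plot
reading = OMObservation(
    procedure="soil-probe-7",
    observed_property="soil_moisture",
    feature_of_interest="plot-42",
    phenomenon_time="2016-05-10T06:00:00Z",
    result=23.4,
    unit="%",
)
payload = to_json(reading)
restored = from_json(payload)
```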

Fig. 2. Overview diagram of SensLog application and communication interfaces

2.3 Machinery Fleet Monitoring

Machinery fleet monitoring, abbreviated as MapLogAgri, is an extension of the SensLog component intended for collecting, storing, processing, analysing and publishing (near) real-time measurements of tractors and their application machines like spreaders or sprayers. The collecting, processing and analysing capabilities in particular differ from the SensLog solution due to the amount of data measured within one time frame.

As depicted in Fig. 3, a tractor contains a monitoring unit that is the centralised point from which it is possible to configure which data will be collected and how. The monitoring unit is connected to a GPS receiver to obtain the position of the tractor over the whole course of an agricultural intervention. The position of the application machine is computed from the position of the tractor and the size of the application machine. Communication between the tractor and the application machine is achieved through RFID (Radio Frequency Identification). The RFID interface enables the whole solution to be set up as a modular one. It is then easy to combine any tractor with any application machine as long as they follow the same standards. The same applies to the ECU CAN/BUS (Electronic Control Unit, Controller Area Network) interface as a mediator between the monitoring unit and the software of the tractor.
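The derivation of the application machine's position can be sketched as a simple geometric offset. The text only states that the tractor position and the size of the machine are used; the sketch below additionally assumes that positions are given in a local planar coordinate system in metres (easting, northing), that the heading is estimated from two consecutive GPS fixes, and that the machine trails at a fixed distance directly behind the GPS antenna. All function names are hypothetical.

```python
import math


def heading_from_fixes(previous, current):
    """Heading in degrees clockwise from north, from two (easting, northing) fixes."""
    d_east = current[0] - previous[0]
    d_north = current[1] - previous[1]
    return math.degrees(math.atan2(d_east, d_north)) % 360.0


def implement_position(tractor, heading_deg, offset_m):
    """Position of a machine trailing offset_m metres behind the tractor."""
    heading = math.radians(heading_deg)
    # Step back along the direction of travel by the assumed offset
    return (
        tractor[0] - offset_m * math.sin(heading),
        tractor[1] - offset_m * math.cos(heading),
    )


# Tractor driving due north; the machine is assumed to trail 4 m behind
previous_fix = (100.0, 200.0)
current_fix = (100.0, 210.0)
heading = heading_from_fixes(previous_fix, current_fix)
implement = implement_position(current_fix, heading, offset_m=4.0)
```

A real solution would also have to account for the working width of the machine and for turning manoeuvres, where a trailed machine does not follow the tractor's heading exactly.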

Fig. 3. Communication scheme between a tractor and its application machine

To summarize, the whole solution consists of a set of hardware (e.g. the monitoring unit, GPS receiver) and software (e.g. ECU CAN/BUS and RFID interfaces as well as the user application). MapLogAgri may be understood as a hardware and software solution offering data for an OpenFMIS. The FOODIE system as a whole is then an example of an OpenFMIS.

Standardised communication also exists on the level of machinery sensor monitoring. ECU/CAN (Electronic Control Unit/Controller Area Network) BUS represents a robust vehicle bus used in the automotive industry, no matter whether the vehicle is a personal car or a bus [35]. In contrast, the so-called ISO BUS is a specific definition of a universal protocol for electronic communication between tractors and computers originating from the ISO 11783 standard, “Tractors and machinery for agriculture and forestry - Serial control and communications data network” [36]. So far (June 2017), ISO BUS is understood as a part of the Internet of Things concept [37], although it lies beyond the activities of the Industrial Internet Consortium (IIC [38]). ISO BUS and ECU/CAN BUS are used for communication on the sensor level. Note that the latest versions of ISO BUS do not offer straightforward open communication as their predecessors did. It is an open question whether this is a consequence of a changing policy on the part of agricultural machine producers. Some of them have recently introduced a new business model where the farmer only rents agricultural machines. The license agreement prohibits the farmer from buying the machine. As a result, all the data measured remain the property of the agricultural machine producer. We have identified this change as a threat to the openness of machinery sensor monitoring.

In total, 9 tractors and 23 machines were monitored second by second on the “Tršice” farm in the Czech Republic from March 2015 till the end of 2016. Such monitoring has proven the need for a robust cloud-based geospatial solution since, in this case, ten megabytes of data were generated each day, i.e. when operating on a real commercial farm with almost 1,300 hectares. Geospatial data revealed significant differences between economic (e.g. fuel consumption or time needed to perform the same operation) and ecological characteristics (e.g. the number of fertilizer or pesticide applications as a result of an inappropriate trajectory); see also Fig. 4. The discovered differences are the subject of ongoing analyses.

Fig. 4. Environmental, economic and geospatial results of machinery sensor monitoring

2.4 Conclusions

The three presented modules, i.e. yield potential, sensor measurement, and machinery fleet monitoring, were developed to minimize the environmental burden arising from agricultural activities. All the modules aim to reduce the amount of fertilizer or pesticide applied during agricultural production, and thus to ensure that lower amounts of nutrients or chemical residues such as nitrogen appear in soil and/or (ground) water. That is, they should help to reduce the eutrophication of natural water sources. Therefore, the integration of these modules was adopted as part of the GEOSS AIP-8 (Global Earth Observation System of Systems’ Architecture Implementation Pilot 8) targeting agricultural and water pollution.

The presented geospatial technologies focus on the environment during the agricultural planning phase, i.e. via yield potential monitoring, as well as during the application phase, where near real-time monitoring is supported by means of meteorological and hydrological sensors as well as by the machinery fleet monitoring module. The machinery fleet monitoring module was developed as a crowdsourcing extension of the sensor network concept (SensLog). In addition, an alerting service with an open REST (Representational State Transfer) API was developed in order to forward notifications to environmental information systems and/or to iOS and Android based mobile devices.

The first discovered bottleneck lies in the general lack of farms producing yield results in the form of maps with sufficient detail (i.e. details of management zones within a plot). So far, only two such farms out of a possible thousand have been identified. Without such maps, spatial correlation between the theoretically computed yield potential on the one hand and yield results on the other is not feasible.

The second discovered bottleneck lies in the technological and licensing limitations relating to machinery fleet monitoring. Technological limitations arise from the various implementations of vague ISO/CAN bus standards by different machine manufacturers and from the need for tractors to have on-board computers. Licensing agreements prohibit farmers from buying machines. As a result, all data measured remain the property of the agricultural machine producer.

The third discovered bottleneck lies in Big Data processing. For instance, yield potential was computed for the country of Luxembourg (an area of about 2,500 km², i.e. 1,000 sq mi, comprising some 92,000 plots) in order to verify the concept of transferability. The computation of yield potential through the Empirical Bayesian Kriging interpolation of Landsat 8, Sentinel 2, and MODIS (Moderate Resolution Imaging Spectroradiometer) satellite data, together with smoothing to a spatial resolution of 5 meters, took two weeks at the Poznań Supercomputing and Networking Center (64 computer nodes in the cloud, each node with two E5 2670 v2 / 2697 v3 @ 2.6 GHz processors, an operational memory of 256 GB, and a storage capacity of 40 TB, connected via Fibre Channel, 50 TB Ceph via 10 Gbit Ethernet). PostgreSQL XL (Structured Query Language, eXtra Large), i.e. a version for cloud-based solutions, was found to be insufficient for the storage and processing of computed yield potential zones.

Ongoing research of the DataBio project focuses on the development of more efficient storage and processing capabilities in order to increase the performance of the provided yield potential-, sensor-, and machinery monitoring-related open Web services for environmental information systems. The improvement of yield potential computation is also a subject of further research.