MEMOn: Modular Environmental Monitoring Ontology to link heterogeneous Earth observed data
Introduction
In recent years, the Earth has undergone rapid climate change, which is believed to have led to an increase in natural disasters such as storms, floods, and hurricanes. These disasters have dramatically influenced not only the natural environment but also human life. Consequently, research communities have given great importance to the development and implementation of Earth Observation (EO) systems (such as sensors and satellite platforms) and EO programs (such as the Copernicus Program and SERVIR Global).
Along with the increased number of monitoring solutions, a multitude of heterogeneous environmental data is generated. This data includes hundreds of millions of records of climate data, ocean and coast data, land data, and more, stored in different formats (databases, CSV files, raster images, etc.). As this volume of data grows, so does its semantic heterogeneity (synonymy, polysemy, etc.). For example, in the sentence “Maps of daily temperature and precipitation are produced,” an expert would recognize that the observation is “temperature” but could not determine which temperature concept is meant (atmospheric temperature, sea surface temperature, etc.). The expert would need to ask the data provider for more details.
Additionally, on the one hand, the same term may carry different meanings in different disciplines. For example, in the biological domain the term “environment” is defined as “the biological and abiotic elements surrounding an individual organism,” whereas in other contexts it refers to “all the natural components of the Earth (air, water, oils, etc.)” (Sauvé et al., 2016). On the other hand, various terms may correspond to the same meaning. For instance, the Observatory of Sahara and Sahel (OSS) may use the word “rainfall” for the same real-world feature that is usually referred to as “precipitation” in other sources. This variety of terms complicates the work of emergency responders, who must be familiar with the terms used in each discipline.
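As a minimal sketch of how the synonymy described above might be handled in code, the following assumes a hand-written table that maps source-specific terms to a shared concept label. The table, the function name, and the fallback behaviour are hypothetical illustrations and are not taken from MEMOn; only the “rainfall”/“precipitation” pair comes from the text.

```python
# Hypothetical synonym table: source-specific term -> shared concept label.
# Only the "rainfall"/"precipitation" pair comes from the text; the rest is illustrative.
SYNONYMS = {
    "rainfall": "precipitation",
    "precipitation": "precipitation",
}


def canonical_term(term: str) -> str:
    """Return the shared concept label for a source term.

    Unknown terms are returned in their normalized (trimmed, lower-case) form
    so that the caller can decide how to handle them.
    """
    key = term.strip().lower()
    return SYNONYMS.get(key, key)


if __name__ == "__main__":
    for source_term in ("Rainfall", "precipitation", "Wind speed"):
        print(source_term, "->", canonical_term(source_term))
```

A flat lookup table of this kind only addresses synonymy. Polysemy, where one term has several meanings depending on the discipline, needs the richer context that an ontology provides.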
With this extensive variety of environmental data, it is becoming increasingly difficult for domain experts to understand natural phenomena and reduce the adverse effects of climate change. The exploitation of EO data is still limited because of the silos between data sources, systems, and programs. Each source offers data or models to be discovered or reused. Both data and semantic models encode domain knowledge that resides with the experts. However, this knowledge is often not available, and data analysts need to contact the original data sources and model producers to understand and use them properly. For example, the OSS provides data about climate change such as precipitation and temperature, the National Oceanic and Atmospheric Administration (NOAA) offers marine data, and the EMergency events DATabase (EM-DAT) provides data about flood events. Unfortunately, these organizations do not collaborate to link the data they produce. Even when it is available online, this data is kept in isolated silos.
Undeniably, we have not reached a level where data and models are interoperable and linked so that experts can reuse them soundly. We are still far from the vision of a common environmental information space (Athanasiadis, 2015). Our purpose is to break down those silos to provide what we call a global data view, in which different EO systems and programs have unhampered and uniform access to the available environmental data, linked and synthesized into a single knowledge graph. This global data view allows the data sources to speak the same language and to share information so that domain experts can transform information into actionable knowledge. By knowledge graph we mean a multi-relational graph composed of entities and the relationships between them (Ehrlinger and Wöß, 2016). With this knowledge graph, experts can look at all of this data and search its correlations for meaning, in order to understand natural phenomena and make the right decisions about disaster risk prevention.
Hurricane Irma, which struck the Caribbean in 2017, serves as an example of how the lack of a global view of environmental knowledge and the absence of linked observed data hinder the anticipation and understanding of natural phenomena. The traditional conditions of hurricane development (such as sea level, wind speed, and atmospheric pressure) were monitored, as usual, by the NOAA National Hurricane Center. In addition, the African Sahara Desert was observed by NASA. However, there was no link between these observations. Indeed, the absence of dry air resulting from the lack of Saharan dust across the Atlantic (NOAA, 2014) acted in favor of high-altitude winds. When these waves of air have enough moisture, lift, and instability, they readily form clusters of thunderstorms; a tropical cyclone formed as the areas of disturbed weather moved westward across the Atlantic, resulting in Hurricane Irma. If experts had had this link beforehand and understood the phenomenon better, they might have been able to predict the power of the disaster somewhat earlier and alert governments.
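To make the notion of a knowledge graph as a multi-relational graph of entities and relationships more concrete, the following minimal sketch stores the graph as (subject, predicate, object) triples in plain Python. All identifiers, predicate names, and links below are hypothetical and chosen only for illustration; they are not MEMOn terms and do not describe real observations.

```python
from collections import defaultdict

# Each edge is a (subject, predicate, object) triple.
# All identifiers and predicate names are hypothetical.
TRIPLES = [
    ("obs_oss_1", "observedProperty", "precipitation"),
    ("obs_oss_1", "providedBy", "OSS"),
    ("obs_noaa_1", "observedProperty", "sea_surface_temperature"),
    ("obs_noaa_1", "providedBy", "NOAA"),
    ("flood_event_1", "recordedBy", "EM-DAT"),
    ("flood_event_1", "relatedTo", "obs_oss_1"),
]


def build_index(triples):
    """Index triples by (subject, predicate) so lookups avoid scanning the list."""
    index = defaultdict(set)
    for subject, predicate, obj in triples:
        index[(subject, predicate)].add(obj)
    return index


def objects(index, subject, predicate):
    """Return the set of objects linked to `subject` by `predicate`."""
    return index.get((subject, predicate), set())


def sources_linked_to(index, entity):
    """Follow 'relatedTo' edges from `entity` and collect the providers
    of the observations reached that way."""
    providers = set()
    for observation in objects(index, entity, "relatedTo"):
        providers |= objects(index, observation, "providedBy")
    return providers


INDEX = build_index(TRIPLES)

if __name__ == "__main__":
    # Which data providers are connected to the (hypothetical) flood event?
    print(sources_linked_to(INDEX, "flood_event_1"))
```

Once observations from separate providers sit in the same graph, a question that spans sources becomes a short traversal. This is the kind of cross-source link the Irma example lacked.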
A global data view is further challenged by data integration. Data integration is the process of combining data retrieved from multiple independent sources to provide an integrated and interoperable structure (Lenzerini, 2002). The main challenges confronting data integration are data linking and semantic interoperability. To deal with these issues, many studies have applied the ontology-based approach in the field of environmental monitoring (Lv and El-Gohary, 2016; Stasch et al., 2014). However, several problems arise when these ontologies are used in software development. First, some existing ontologies deal only with specific aspects of environmental monitoring (Boughannam et al., 2013; Ma et al., 2014) and cannot be used in other contexts. Consequently, in most cases they do not offer the coverage needed to annotate or link observed data. Little attention has been given to covering all environmental monitoring disciplines simultaneously, so a global view of knowledge cannot be ensured. Second, many ontologies are built without considering their reusability (Zhang et al., 2016; Dahleh and Fox, 2016). Finally, while there have been several fundamental ideas regarding the application of ontologies to environmental monitoring (e.g., Curry et al., 2013; Devaraju et al., 2015), take-up by broader communities has been limited. Only a few applied attempts have been put into practice (Corominas et al., 2018).
To deal with these issues, Masmoudi et al. (2018) proposed a semantic-oriented platform named PREDICAT (PREDIct natural CATastrophes), which aims to provide data interoperability and linking in EO and disaster prediction. PREDICAT aims to 1) ensure uniform access to heterogeneous data by providing adequate services, 2) integrate EO data coming from several sources, including data provided by citizens, and 3) provide a decision support solution that analyzes all the data in real time to effectively prevent and react to natural disasters. Fig. 1 presents the global architecture of the PREDICAT platform and its tiers. In this paper, we address only the semantic layer by proposing a Modular Environmental Monitoring Ontology (MEMOn) developed following an original agile methodology. This ontology will not only provide a common vocabulary for the domain but also facilitate the semantic linking of data from different sources through a knowledge graph.
The rest of this paper is organized as follows. Section 2 provides an overview of current work on ontologies that can be used to semantically model sensors, observations, and environmental monitoring domains. Section 3 describes the approach to MEMOn ontology development. Section 4 presents and discusses the evaluation of the approach through real case studies. Section 5 discusses the proposed approach and presents concluding remarks as well as research perspectives.
Section snippets
Background, state of the art and motivations
In this section, we briefly introduce the ontology classification based on the domain scope. Then, we give an overview of sensor, observation, and measurement ontologies followed by related work on environmental monitoring ontologies. We also underline the motivation for our proposal.
MEMOn: Modular Environmental Monitoring Ontology
This section presents the methodology used to build the proposed modular ontology for the environmental monitoring field. Building a modular ontology is not a straightforward task, especially when ontologies become larger and more complex (Fernandez-Lopez and Corcho, 2004). Therefore, a methodology that guides and manages the modular ontology development is necessary.
However, one of the thorniest problems is how to choose the appropriate methodology, taking into account the complexity of the domain
MEMOn ontology evaluation
In this section, we present the MEMOn evaluation step through the verification (Section 4.1) as well as the validation processes (Section 4.2).
Discussion and conclusion
In the present work, we proposed MEMOn, a Modular Environmental Monitoring Ontology that supports semantic interoperability, data integration, and data linking to provide a global data view. One of the strengths of the methodology adopted for ontology development in this work is its reliance on real environmental data both for collecting the vocabulary and for testing the ontology. We believe that conforming to this vocabulary can guarantee a global vision of data from multiple sources. This global
Software availability
The ontology modules described in this paper are available in a complete view at https://sites.google.com/view/predicat/memon.
The current version of MEMOn is available in OWL format. For direct download, readers may refer to https://github.com/MEMOntology/memon. The repository contains MEMOn's source code and the imported ontologies.
Acknowledgments
This research was financially supported by the “PHC Utique” program of the French Ministry of Foreign Affairs, managed by Campus France, and the Tunisian Ministry of Higher Education and Scientific Research, managed by the CMCU (project number 17G1122/ CODE CF 37T03 NJ).
The authors would like to thank the OSS experts for their cooperation by providing support, domain knowledge, and environmental data.
References (54)
- et al. The SSN ontology of the W3C semantic sensor network incubator group.
- et al. Transforming data into knowledge for improved wastewater treatment operation: a critical review of techniques. Environ. Model. Softw. (2018)
- et al. Linking building data in the cloud: Integrating cross-domain building data using linked data. Adv. Eng. Inform. (2013)
- et al. Enhanced context-based document relevance assessment and ranking for improved information retrieval to support environmental decision making. Adv. Eng. Inform. (2016)
- et al. Ontology engineering in provenance enablement for the national climate assessment. Environ. Model. Softw. (2014)
- et al. An ontology for describing and synthesizing ecological observation data. Ecol. Inf. (2007)
- et al. Reasoning about river basins: WaWO+ revisited. Environ. Model. Softw. (2017)
- et al. An integrated flood management system based on linking environmental models and disaster-related data. Environ. Model. Softw. (2017)
- et al. Knowledge representation in the semantic web for Earth and environmental terminology (SWEET). Comput. Geosci. (2005)
- et al. Semantic validation of environmental observations data. Environ. Model. Softw. (2016)
- Meaningful spatial prediction and aggregation. Environ. Model. Softw.
- Building Ontologies with Basic Formal Ontology.
- Challenges in modelling of environmental semantics.
- A place and event-based context model for environmental monitoring. Context-Aware. Geogr. Inf. Serv. (CAGIS 2014)
- OpenGIS Sensor Model Language (SensorML) Implementation Specification.
- Smarter Environmental Analytics Solutions: Offshore Oil and Gas Installations Example.
- The environment ontology in 2016: bridging domains with increased scope, semantic density, and interoperation. J. Biomed. Semant.
- Common core ontologies for data integration.
- An Environment Ontology for Global City Indicators (ISO 37120), Enterprise Integration Laboratory.
- A formal model to infer geographic events from sensor observations. Int. J. Geogr. Inf. Sci.
- Generating ontologies via language components and ontology reuse.
- Ontology Modularization: Principles and Practice.
- Towards a definition of knowledge graphs.
- Toward the use of upper-level ontologies for semantically interoperable systems: an emergency management use case.
- Ontologies in Urban Development Projects.
- Ontological Engineering.
- A Foundation Ontology for Global City Indicators.
Cited by (8)
Knowledge hypergraph-based approach for data integration and querying: Application to Earth Observation
2021, Future Generation Computer Systems
Citation excerpt: …for the third one. Let us now consider an example of a SPARQL query based on the MEMOn ontology (Modular Environmental Monitoring Ontology) [24], illustrated in Fig. 3. The query asks for precipitation data observed in “Miami” on 16 October 2019 and comprises five triple patterns T1, T2, T3, T4, and T5.
Deep Learning-Based Extraction of Concepts: A Comparative Study and Application on Medical Data
2023, Journal of Information and Knowledge Management
WMO: an ontology for the semantic enrichment of wetland monitoring data
2023, International Journal of Digital Earth
Characterizing water quality datasets through multi-dimensional knowledge graphs: a case study of the Bogota river basin
2022, Journal of Hydroinformatics
Constituent vs Dependency Parsing-Based RDF Model Generation from Dengue Patients' Case Sheets
2022, Journal of Information and Knowledge Management
BFO-based ontology enhancement to promote interoperability in BIM
2021, Applied Ontology