A machine learning approach to detect crude oil contamination in a real scenario using hyperspectral remote sensing

https://doi.org/10.1016/j.jag.2019.101901

Highlights

  • An airborne hyperspectral image was taken 2.5 years after an oil spill.

  • The image was utilized to map the longstanding contamination.

  • A new approach for mapping was developed using machine learning techniques.

  • High classification accuracy metrics (>0.85) were achieved.

  • Test scenes showed good agreement with a high-resolution image.

Abstract

One of the most ubiquitous and detrimental types of environmental contamination in the world is crude oil pollution. When released into either the aquatic or terrestrial environments, this pollution can negatively impact flora and fauna, as well as human health. Hence, a rapid and affordable spatial assessment of the pollution is favored to limit the spill's effects. Using airborne hyperspectral remote sensing (HRS) for crude oil detection in terrestrial areas has been investigated in previous studies, which mainly relied on heavily oiled artificial samples. These studies and others based their methodologies on the premise that the spectral features of petroleum hydrocarbon (PHC) are clearly observable, which might not be true in all cases. In this study, we aimed to assess the true potential of using HRS for terrestrial oil spill mapping in a real disaster site in southern Israel, where laboratory and controlled conditions do not apply. Using the AISA SPECIM Fenix 1K sensor, we collected an airborne image of the study site and analyzed the data with advanced data mining techniques. Various challenges and limitations arose from the airborne HRS image being taken two and a half years after the crude oil had been released into the environment and exposed on the surface. Here, no spectral features of PHC were detectable in the spectrum, preventing the use of PHC indices and spectral methods developed by others. Nevertheless, by using standardization techniques, vicarious band selection, dimension reduction, multivariate calibration, and supervised machine learning, we were able to successfully distinguish contaminated pixels from non-contaminated ones. Classification accuracy metrics of overall accuracy, sensitivity, specificity, and Kappa yielded good results of 0.95, 0.95, 0.95 and 0.9, respectively, for cross-validation, and 0.93, 0.91, 0.94 and 0.85 for the validation dataset. The classified image and test scenes also showed strong agreement with an orthophoto image taken several days after the disaster, when the pollution was clearly visible. Thus, we conclude that HRS technology can detect PHC traces in an oil spill site, even under the most challenging conditions.
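
As an illustration of how the four reported metrics relate to one another, the following minimal Python sketch computes them from a binary confusion matrix. The counts are hypothetical (100 contaminated and 100 clean pixels) and are not the study's data; the function name `binary_metrics` is likewise an assumption made for this example.

```python
def binary_metrics(tp, fn, fp, tn):
    """Overall accuracy, sensitivity, specificity and Cohen's kappa
    for a two-class (contaminated vs. clean) confusion matrix."""
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n          # share of correctly labelled pixels
    sensitivity = tp / (tp + fn)      # contaminated pixels correctly found
    specificity = tn / (tn + fp)      # clean pixels correctly found
    # Agreement expected by chance, from the row and column totals
    p_e = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (accuracy - p_e) / (1 - p_e)
    return accuracy, sensitivity, specificity, kappa


# Hypothetical counts, NOT taken from the study
acc, sens, spec, kappa = binary_metrics(tp=95, fn=5, fp=5, tn=95)
print(round(acc, 2), round(sens, 2), round(spec, 2), round(kappa, 2))
```

With these balanced hypothetical counts, chance agreement is 0.5, so an accuracy of 0.95 corresponds to a Kappa of about 0.9. The class balance of the actual cross-validation and validation datasets is not stated in this text, so the example should not be read as a reconstruction of the reported results.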

Introduction

Crude oil pollution is a global anthropogenic problem. It mainly occurs through oil spills, leaks, and accidents caused by either system failures, human errors, or negligence. The presence of crude oil in the environment is hazardous or even deadly to wildlife, as well as to humans, and has negative effects on the natural environment (Aguilera et al., 2010; Chima and Vure, 2014; Kruse et al., 1993; Ramirez et al., 2017). In the case of an oil spill event, the location and spatial extent of the pollution must be precisely mapped. This information can then be transferred to response personnel to implement countermeasures and focus on cleaning, rehabilitation, and long-term monitoring efforts, in order to minimize the environmental damage.

Hyperspectral remote sensing (HRS) technology, also known as imaging spectroscopy, records the radiation reflected or emitted from the surface in tens or hundreds of continuously spaced narrow spectral bands in each pixel. The spectral response in each pixel can then be related to chemical and/or physical properties of the surface or material (Ben-Dor et al., 2009; Ben-Dor et al., 2008; Kruse, 1988). Hence, HRS can be utilized to detect individual absorption features (i.e., the fraction of incident radiation absorbed by the material over a range of wavelengths) due to specific chemical bonds in a solid, liquid, or gas. However, the actual detection of materials using HRS depends on the spectral coverage, spectral resolution, and signal-to-noise ratio (SNR) of the sensor, as well as the abundance of the material and the strength of the absorption features (Andreoli et al., 2007).

Crude oil, which is transported in ships and pipelines from production sites to distillation facilities, contains petroleum hydrocarbons (PHC), which have distinctive spectral absorptions at ∼1200, ∼1700, and ∼2300 nm (see Fig. 3) due to the vibrational stretching and bending of the C-H bond in alkanes (Cloutis, 1989; Schwartz et al., 2009; Winkelmann, 2005). Therefore, and as opposed to methods such as drilling and geochemical analysis, HRS can be used to detect PHC on the surface (of bare soils) from a distance (Asadzadeh and de Souza Filho, 2017; Ellis et al., 2001; Hörig et al., 2001; Kühn et al., 2004; Pabón et al., 2019; Scafutto et al., 2017; Van Der Meer et al., 2002), thereby offering a time-saving, cost-effective, and non-destructive method of characterizing oil spills and their impact on the environment.

To date, most studies conducted on the ability to detect PHC using HRS in terrestrial areas have been based on an indirect approach, which mainly identifies anomalies in vegetation or soil caused by the presence of PHC (e.g., Khanna et al., 2013; Kokaly et al., 2013; Li et al., 2005; Noomen et al., 2012; Sanches et al., 2013; Van Der Werff et al., 2008; Yang et al., 2000). Fewer studies have focused on the direct approach, that is, directly sensing the spectral signature of PHC on the surface (bare soil) that results from natural seepage or an oil spill. More specifically, as this study deals with the latter situation, only a handful of published studies (Hörig et al., 2001; Kühn et al., 2004; Lenz et al., 2015; Lever et al., 2015; Pabón et al., 2019; Scafutto et al., 2017) used in-situ airborne HRS for that purpose, and only one of them (Pabón et al., 2019) was performed in an actual oil spill situation (oil refinery). The different approaches and methods explored in these studies reported good results in demonstrating the potential of HRS as an operational tool for detecting and monitoring PHC. All of them based their analyses on spectra collected by an airborne hyperspectral sensor in which the spectral features of PHC were clearly visible.

Nevertheless, all of them except Pabón et al. (2019) operated under optimal and controlled conditions, such as: (i) use of artificial man-made samples which, among other things, ensure at least one pure pixel of their samples to facilitate identification of PHC and end-member selection; (ii) HRS image acquisition performed shortly after the artificial samples had been contaminated; (iii) use of reference spectra from the laboratory or in-situ; and most importantly (iv) PHC spectral absorptions were clearly visible in the HRS spectra. None of these studies attempted to evaluate direct detection of PHC by HRS in a real disaster zone, where more limitations need to be considered than under controlled conditions. For example, when pollution has been in the soil for some time, the PHC spectral signal that reaches the airborne sensor from the surface is attenuated. The study conducted by Pabón et al. (2019) was executed in an oil refinery in Brazil (area of ∼0.7 km²). They showed the spectral features of the PHC in the HRS image (ground sampling distance, GSD, of 1 m) and reported high classification accuracy. However, there was no information about how long the spill had been on the surface, or whether there was ongoing leakage that might continuously replenish the oil absorbed in the soil and thereby keep the spectral signal of the oil evident in the spectrum of the airborne image.

In this study, an airborne HRS image was acquired over an arid area in southern Israel two and a half years following a major crude oil spill. The detection and mapping of PHC in a real disaster zone poses new challenges that might not exist in controlled environments. From the moment the oil spill occurred until the image was taken (two and a half years later), the spilled crude oil was exposed to atmospheric conditions as well as to biological, chemical, and physical processes. In addition, the area itself was subjected to remediation and rehabilitation efforts shortly after the spill, including the removal of soil together with oil. Moreover, the illumination conditions at the time of the flight were suboptimal, due to sparse clouds, resulting in varying illumination intensities across the scene. These challenges resulted in a lack of clear PHC spectral absorption features in the spectra (further discussed in Section 2.7.2).

Therefore, this study, in contrast to others, could not be based on the premise that PHC spectral features will appear in the spectrum of contaminated pixels. In other words, relying on the PHC absorption features and/or absorption minimum/maximum or even laboratory spectra would have resulted in poor classification accuracy and thus, a different approach had to be taken.

Hence, a supervised machine learning workflow was developed, i.e., training a classifier (a machine learning model) on a set of contaminated and uncontaminated pixels from the scene and predicting, per pixel, whether it is clean or contaminated. Machine learning models can pick up subtle variations between classes in the data that are extremely difficult or even impossible for humans to notice. However, training the model blindly on the entire spectrum of the selected pixels might give poor results, because not all the spectral bands contain relevant information. Instead, this study used standardization techniques combined with vicarious band selection and dimension reduction to create a better training dataset upon which the model could be built.
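
To make the standardization step more concrete, the sketch below applies a standard normal variate (SNV) transform to a single pixel spectrum. This text does not say which standardization techniques were used, so SNV is an assumption chosen for illustration only; the function name `snv` and the toy reflectance values are hypothetical.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre one spectrum on its own mean and
    scale it by its own sample standard deviation."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("flat spectrum: SNV is undefined")
    return [(value - m) / s for value in spectrum]


# Hypothetical 4-band reflectance of the same surface under two
# illumination levels (the second is the first scaled and offset).
sunlit = [0.2, 0.4, 0.6, 0.4]
shaded = [0.5 * value + 0.05 for value in sunlit]

print([round(v, 3) for v in snv(sunlit)])
print([round(v, 3) for v in snv(shaded)])
```

Because SNV removes a per-spectrum offset and scale, the two toy spectra become identical after the transform. This is one way a standardization step could reduce the effect of varying illumination across a scene before any band selection or model training.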

The purpose of this study was to develop an approach to detect and map contaminated pixels, despite the mentioned obstacles, long after the oil spill had occurred.

Section snippets

Study area

The Evrona Nature Reserve (Fig. 1) in Israel is located ∼20 km north of Eilat in the Arava Valley, which is part of the Dead Sea Rift extending between the Dead Sea to the north and the Gulf of Eilat to the south. It is crossed by the Evrona drainage system going southward, leaving topographically elevated playa deposits high above the channels. The climate and environmental conditions of Evrona make it one of the most interesting acidic habitats in the world. It contains a variety of unique

Vicarious band selection

When analyzing a spectrum (in the VNIR-SWIR) to detect PHC in a soil substrate, the analysis is guided by the diagnostic spectral ranges of ∼1700 and ∼2300 nm. However, in our case, a statistical test (ANOVA) was used on each spectral band to identify important wavelength ranges for classification. Fig. 9A shows the results of the ANOVA test alongside lab spectra for reference. The higher the F statistic, the better the spectral band is in separating the two classes (oil and clean). An F
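
A per-band ANOVA test of this kind can be sketched in a few lines of Python. The code below is a minimal illustration and not the authors' implementation: the function names `anova_f` and `f_per_band` and the two-band toy spectra are assumptions made for the example.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of values.
    Assumes at least one group has non-zero within-group variance."""
    k = len(groups)                              # number of classes
    n = sum(len(g) for g in groups)              # total number of samples
    grand_mean = mean([v for g in groups for v in g])
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def f_per_band(oil_pixels, clean_pixels):
    """F statistic for each band; each pixel is a list of band values."""
    n_bands = len(oil_pixels[0])
    return [
        anova_f([[px[b] for px in oil_pixels],
                 [px[b] for px in clean_pixels]])
        for b in range(n_bands)
    ]


# Hypothetical two-band spectra: band 0 barely differs between the
# classes, band 1 differs strongly.
oil = [[0.30, 0.50], [0.32, 0.40], [0.31, 0.60]]
clean = [[0.30, 0.80], [0.33, 0.90], [0.29, 0.85]]
scores = f_per_band(oil, clean)
best_band = scores.index(max(scores))
print(scores, best_band)
```

In this toy case the second band receives the higher F statistic, so it would be ranked as more useful for separating the oil and clean classes. How many bands to keep, or which F threshold to apply, is a separate choice that this sketch does not make.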

Discussion

The predominant spectral features of PHC in the VNIR-SWIR, which are located at ∼1700 and ∼2300 nm, can be captured with hyperspectral sensors. This has been previously demonstrated, often in heavily oiled settings, by different researchers, for example Hörig et al. (2001) (HyMap sensor, 128 bands, GSD 2–4 m), Ellis et al. (2001) (PROBE-1 sensor, 128 bands, GSD 5 m), Kokaly et al. (2013) (AVIRIS sensor, 224 bands, GSD 3.5 m), Asadzadeh and de Souza Filho (2017) (AVIRIS sensor, 224 bands, GSD

Conclusion

This study demonstrated the power of utilizing a machine learning approach on airborne hyperspectral data for rapid detection of PHC in a terrestrial spill area long after the event occurred. The study is distinctive in that it was not carried out under laboratory or controlled conditions but in a real oil spill zone, where the oil had been exposed on the surface for two and a half years before the image was acquired. Instead of relying on the spectral absorption

Acknowledgment

The authors would like to thank the members of the remote-sensing laboratory at Tel Aviv University, who assisted in executing the flight campaign and for their constructive comments. We greatly appreciate the crude oil provided by Arnon Karnieli from Ben-Gurion University, Ittai Herrmann from the Hebrew University, Image Sat International – ISI for providing the EROS subset image of the study area, and the Israeli Nature and Parks Authority for providing the GIS layers of the leakage. We also

References (37)

  • I.D. Sanches et al. Assessing the impact of hydrocarbon leakages on vegetation using reflectance spectroscopy. ISPRS J. Photogramm. Remote Sens. (2013)

  • R.D.M. Scafutto et al. Hyperspectral remote sensing detection of petroleum hydrocarbons in mixtures with mineral substrates: implications for onshore exploration and monitoring. ISPRS J. Photogramm. Remote Sens. (2017)

  • S. Wold et al. PLS-regression: a basic tool of chemometrics. Chemom. Intell. Lab. Syst., PLS Methods (2001)

  • F. Aguilera et al. Review on the effects of exposure to spilled oils on human health. J. Appl. Toxicol. JAT (2010)

  • G. Andreoli et al. Hyperspectral Analysis of Oil and Oil-Impacted Soils for Remote Sensing Purposes (2007)

  • A. Baratloo et al. Part 1: simple definition and calculation of accuracy, sensitivity and specificity. Emergency (2015)

  • R.J. Barnes et al. Standard normal variate transformation and de-trending of near-infrared diffuse reflectance spectra. Appl. Spectrosc. (1989)

  • A. Brook et al. Supervised vicarious calibration (SVC) of multi-source hyperspectral remote-sensing data. Remote Sens. (2015)