Modelling LiDAR derived tree canopy height from Landsat TM, ETM+ and OLI satellite imagery—A machine learning approach

https://doi.org/10.1016/j.jag.2018.08.013

Highlights

  • Canopy height was predicted from Landsat imagery (RMSE values between 2.3 m and 4.1 m).

  • Random forest regression accounted for complex vegetation structural types.

  • The model was robust across a range of vegetation communities and Landsat platforms.

  • Canopy height was used to identify structural change through time (1987–2016).

Abstract

Understanding ecological changes in native vegetation communities often requires information over long time periods (multiple decades). Tropical cyclones can have a major impact on woody vegetation structure across northern Australia; however, understanding of these impacts is limited. Woody vegetation structural attributes such as height are used in ecological studies to identify long-term changes and trends. LiDAR has been used to measure woody vegetation structure; however, LiDAR datasets cover relatively small areas and historical coverage is restricted, limiting the use of this technology for monitoring long-term change. The Landsat archive spans multiple decades and is suitable for regional/continental assessment. Advances in predictive modelling using machine learning algorithms have enabled complex relationships between dependent and independent variables to be identified. The aim of this study is to develop a predictive model to estimate woody vegetation height from Landsat imagery to assist in understanding change through space and time. A LiDAR canopy height model was produced covering a range of vegetation communities in northern Australia (Darwin region) for use as the dependent variable. A random forest regression model was developed to predict mean LiDAR canopy height (30 m spatial resolution) from Landsat-5 Thematic Mapper (TM) imagery. Validation of the random forest model was undertaken on independent data (n = 30,500), resulting in an overall R² of 0.53 and an RMSE of 2.8 m. The RMSE within four broad vegetation communities ranged from 2.5 to 3.7 m, with the two dominant communities in the study area, Mangrove forests and Eucalyptus communities, recording RMSE values of 2.9 m and 2.5 m respectively. The model was also applied to Landsat-7 Enhanced Thematic Mapper Plus (ETM+) imagery, resulting in an R² of 0.49 and an RMSE of 2.8 m. The model was then applied to all cloud-free Landsat-5 TM, Landsat-7 ETM+ and Landsat-8 Operational Land Imager (OLI) imagery (path/row 106/69) captured in April, May and June from 1987 to 2016 to produce annual estimates (29 years) of canopy height. A number of time traces were produced to illustrate tree canopy height through time in the Darwin region, which was severely impacted by cyclone (hurricane) Tracy on 25 December 1974.
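The validation statistics reported in the abstract (overall R² and RMSE, plus RMSE within each vegetation community) can be illustrated with a minimal sketch. The function names, the community labels and the use of NumPy are assumptions made for illustration and are not taken from the study; in particular, R² is computed here as the coefficient of determination, whereas the study may have used a different definition (for example, the squared correlation).

```python
import numpy as np


def rmse(observed, predicted):
    """Root mean square error, in the units of the inputs (metres here)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def rmse_by_group(observed, predicted, groups):
    """RMSE computed separately for each group label (e.g. vegetation community)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    groups = np.asarray(groups)
    return {
        str(label): rmse(observed[groups == label], predicted[groups == label])
        for label in np.unique(groups)
    }
```

Given arrays of LiDAR-derived heights, model predictions and a community label per validation pixel, `rmse_by_group` returns one RMSE per community, which is the kind of per-community breakdown quoted above.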

Introduction

The value of remote sensing in ecological studies has been well recognised (Roughgarden et al., 1991, Wang et al., 2010, Pettorelli et al., 2014). Landsat satellites have been capturing multispectral imagery of the earth surface since 1972 representing the longest record of temporal space-borne land observations (Roy et al., 2010). Landsat data has been used for a variety of applications, such as natural hazard assessment (Barlow et al., 2003, Joyce et al., 2009), fire scar mapping (Gill et al., 2000, Goodwin and Collett, 2014), coral reef mapping (Joyce et al., 2004), rangeland monitoring (Wallace et al., 2004, Scarth et al., 2010), temperate and tropical forest mapping (Brown et al., 2000, Renó et al., 2011), and many others. The characteristics of the Landsat sensors have been identified as valuable for regional monitoring applications (Cohen and Goward, 2004). The spectral and spatial resolution of the Landsat imagery combined with its temporal record make it valuable for monitoring woody cover change across large regions (Woodcock et al., 2001, Danaher et al., 2004, Staben et al., 2016, Gill et al., 2017). Amongst its many applications Landsat imagery has been utilised to detect severe forest damage (Ekstrand, 1996) including damage as a result of cyclonic (hurricane) winds (Preston, 1987, Paling et al., 2008, Staben and Evans, 2008).

Tropical cyclones occur on a frequent basis across the coastline of the Australian Northern Territory. The destructive winds associated with these cyclones can have a major impact on both the man-made and natural environments. The impact of cyclonic winds is greatest in coastal regions; however, they also have the potential to cause significant disturbance further inland (e.g. Cyclone Monica) (Staben and Evans, 2008). The impact on native vegetation can be significant, resulting in major structural changes to vegetation communities. A number of studies have reported on the impact of cyclones on vegetation in the Northern Territory (Stocker, 1976, Fox, 1980, Cameron et al., 1983, Bowman and Panton, 1994, Cook and Goyens, 2004, Staben and Evans, 2008, Williamson et al., 2011, Hutley et al., 2013). These studies have used a number of methods, ranging from collection of field data to aerial photography and satellite imagery. Although cyclones are frequent and have the potential to be a major disturbance agent in ecosystems across the Northern Territory (Murphy, 1984), very few studies have been undertaken to quantify the impact and potential role they play in driving the structure of these communities (Cook and Goyens, 2004). While it is well recognised that fire and the stress of the seasonal drought (a characteristic of the wet-dry tropics of northern Australia) are frequent disturbance factors on vegetation communities, very little focus has been given to the impact cyclones have on these ecosystems (Cook and Goyens, 2004, Hutley et al., 2013).

While severe damage to woody vegetation can be relatively easy to identify by comparing satellite imagery captured directly before and after the change event (e.g. cyclones), accurate assessment of the subtle changes through time is enhanced by relating biophysical variables to satellite remote sensing observations. To obtain quantitative information from optical satellite data, relationships between these observations and biophysical variables need to be established (Moulin et al., 1998). Numerous studies have derived empirical relationships between Landsat imagery and field-based measurements such as leaf area index (Coops et al., 1997, Eriksson et al., 2006), above ground biomass of woody vegetation (Foody et al., 2003, Powell et al., 2010, Avitabile et al., 2012), fractional cover (Scarth et al., 2010) and woody vegetation foliage projective cover (Danaher et al., 2004, Armston et al., 2009). A variety of statistical methods have been used to develop these relationships, including linear and non-linear regression models based on single or multiple predictor variables (Cohen et al., 2003), while others have used machine learning algorithms such as neural networks, tree-based models, K-nearest neighbours and support vector machines (Labrecque et al., 2006, Li et al., 2010, Avitabile et al., 2012).

Vegetation height has been identified as a key parameter for inferring long term trends in biomass and carbon stock (Skidmore et al., 2015, Cook et al., 2015). Combined with species and site quality information vegetation height helps to inform estimates of stand age and successional stages (Stojanova et al., 2010). Light detection and ranging (LiDAR) data has been used extensively to measure woody vegetation structure, and while LiDAR is an efficient way to map and measure woody vegetation structure (Lim et al., 2003, Wulder et al., 2012, Goldbergs et al., 2018), the use of these data at a regional level can be prohibitive due to financial constraints (Pascual et al., 2010). Furthermore, the availability of LiDAR for long-term studies (multiple decades) is limited due to the paucity of data. Ecological processes can occur over long time frames, and understanding these processes often requires information recorded over multiple decades, captured at an appropriate spatial, spectral and temporal resolution. Numerous studies have used structural information obtained from LiDAR data to develop predictive models using Landsat sensors with an aim to enhance the spatial and temporal coverage (Hudak et al., 2002, Pascual et al., 2010, Hill et al., 2011, Ota et al., 2014, Ahmed et al., 2015). These studies have been undertaken across a variety of vegetation communities ranging from conifer forests (Ahmed et al., 2015) to tropical evergreen and deciduous forests (Ota et al., 2014, Hill et al., 2011, Wilkes et al., 2015). In southern Australia Wilkes et al. (2015) predicted canopy height over a 2.9 million ha area of heterogeneous temperate forests by developing a relationship between LiDAR derived canopy height and a combination of satellite imagery (Landsat and Moderate Resolution Imaging Spectroradiometer) using the random forest algorithm. 
Machine learning techniques based on ensemble models such as random forest have been used successfully for a variety of remote sensing classification and regression modelling applications (Pal, 2005, Avitabile et al., 2012, Mellor et al., 2013, Mellor et al., 2015, Mascaro et al., 2014, Karlson et al., 2015, Wilkes et al., 2015). These studies demonstrate the advantages of the random forest algorithm, such as its robustness to outliers in the training data, its ability to handle non-parametric data and to uncover complicated non-linear relationships between variables, and the ease of tuning the model's parameters.

In this study, we investigate the application of Landsat satellite sensors to predict woody vegetation canopy height and develop a model predicting canopy height across a range of vegetation communities in the wet-dry tropics of northern Australia. While previous studies have demonstrated a fusion of different sensors and LiDAR to derive predictive models of canopy height in Australia (Wilkes et al., 2015), this study investigates the use of Landsat sensors alone for the estimation of canopy height over a long time series spanning multiple decades. To our knowledge this is the first study to look at predicting LiDAR-derived canopy height from Landsat sensors in the wet-dry tropics of northern Australia. A canopy height model (1 m spatial resolution) was produced from a LiDAR dataset captured in 2009 for use as the dependent variable. Random forest regression was used to produce a model to predict LiDAR-derived canopy height from a single Landsat-5 Thematic Mapper (TM) image captured in 2009 (30 m spatial resolution). We developed a three-stage approach to identify the important independent variables and optimise the parameters used in the random forest model, which was applied to Landsat-5 TM, Landsat-7 Enhanced Thematic Mapper Plus (ETM+) and Landsat-8 Operational Land Imager (OLI) sensors.
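Relating a 1 m canopy height model to 30 m Landsat pixels requires summarising the fine-resolution heights within each coarse cell; the abstract states that mean canopy height at 30 m was the quantity predicted. The sketch below shows one simple way this could be done, as a block mean over a NumPy array. It assumes the 1 m grid is already aligned with the Landsat pixel grid and that no-data pixels are stored as NaN; the function name and these assumptions are illustrative only and do not describe the processing chain actually used in the study.

```python
import numpy as np


def aggregate_chm_mean(chm, factor=30):
    """Aggregate a fine-resolution canopy height model by block mean.

    chm    : 2-D array of canopy heights (m), e.g. 1 m pixels.
    factor : fine pixels per coarse cell side (30 for 1 m -> 30 m).

    Trailing rows/columns that do not fill a whole block are trimmed.
    NaN (assumed no-data) pixels are ignored within each block.
    """
    chm = np.asarray(chm, dtype=float)
    rows = (chm.shape[0] // factor) * factor
    cols = (chm.shape[1] // factor) * factor
    # Split each axis into (coarse index, within-block index) and average
    # over the two within-block axes.
    blocks = chm[:rows, :cols].reshape(rows // factor, factor, cols // factor, factor)
    return np.nanmean(blocks, axis=(1, 3))
```

In practice the aggregation would normally be carried out with georeferenced resampling so that each 30 m cell coincides exactly with a Landsat pixel; the reshape approach only holds when the two grids share an origin.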

Section snippets

Study area

This study was undertaken in the Darwin region, located in northern Australia's wet-dry tropics (Fig. 1). The average annual temperature for the Darwin region is 32 °C with average annual rainfall of 1729 mm, with the majority of the precipitation occurring between October and April. The study site covers an area of approximately 1800 km² consisting of urban, peri-urban development and native vegetation. The dominant native vegetation communities occurring in the study area include Mangrove

Model Development Stage One: optimising number of trees

To reduce the computational burden of the random forest model we undertook an experiment to identify the optimal number of decision trees; the results are presented as box plots in Fig. 4. Each box plot represents the RMSE values for the number of trees in the random forest model (based on 100 using independent test data) with mean RMSE values ranging between 3.18 m and 3.92 m. The lowest mean RMSE score was recorded for n_estimator values 512 and 4096. These results are consistent with other
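An experiment of the kind described in this section, in which test RMSE is recorded for several forest sizes over repeated runs, might be sketched as follows. The use of scikit-learn is an assumption (the source mentions Python code and an n_estimator parameter but does not name a library), and the function name, the random train/test splitting, the small tree counts and the synthetic data are all hypothetical stand-ins chosen to keep the example quick to run.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split


def rmse_vs_n_trees(X, y, tree_counts, n_repeats=5, test_size=0.3):
    """Return {n_trees: array of test RMSE values over repeated random splits}."""
    results = {}
    for n_trees in tree_counts:
        scores = []
        for seed in range(n_repeats):
            # A fresh random split per repeat (assumed design, not the study's).
            X_train, X_test, y_train, y_test = train_test_split(
                X, y, test_size=test_size, random_state=seed
            )
            model = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
            model.fit(X_train, y_train)
            residuals = y_test - model.predict(X_test)
            scores.append(np.sqrt(np.mean(residuals ** 2)))
        results[n_trees] = np.array(scores)
    return results


# Synthetic stand-in for six spectral predictors and canopy height (m).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(300, 6))
y = 20.0 * X[:, 3] - 10.0 * X[:, 2] + rng.normal(0.0, 2.0, size=300)

results = rmse_vs_n_trees(X, y, tree_counts=[8, 32, 128], n_repeats=3)
for n_trees, scores in results.items():
    print(n_trees, "trees: mean RMSE =", round(float(scores.mean()), 2))
```

The per-count arrays in `results` are what a box plot per tree count would be drawn from; larger counts such as the 512 and 4096 values mentioned above can be passed in `tree_counts` at the cost of longer run times.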

Conclusions

In this study we implemented a random forest regression model to predict canopy height from a single-date Landsat-5 TM scene, across a variety of natural vegetation communities in the Northern Territory, Australia. The model was trained with a LiDAR-derived canopy height model (CHM) (R² = 0.53, RMSE = 2.8 m). A three-stage approach was undertaken to tune the random forest model and select the predictor variables used in the final model. Despite none of the individual independent predictor

Acknowledgements

This study would not have been possible without the support of the Northern Territory Government and the collaborative partnership between the Northern Territory Government’s Department of Environment and Natural Resources, Rangelands Division and Queensland Government’s Department of Environment and Science, Remote Sensing Centre. Thanks also to Neil Flood for assistance and advice in the development of the Python code used in this study.

References (96)

  • A.T. Hudak et al.

    Integration of lidar and Landsat ETM+ data for estimating and mapping forest canopy height

    Remote Sens. Environ.

    (2002)
  • A.R. Huete

    A soil-adjusted vegetation index (SAVI)

    Remote Sens. Environ.

    (1988)
  • E.R. Hunt et al.

    Detection of changes in leaf water content using near- and middle-infrared reflectances

    Remote Sens. Environ.

    (1989)
  • S. Labrecque et al.

    A comparison of four methods to map biomass from Landsat-TM and inventory data in western Newfoundland

    Forest Ecol. Manage.

    (2006)
  • H. Li et al.

    A framework for creating and validating a non-linear spectrum-biomass model to estimate the secondary succession biomass in moist tropical forests

    ISPRS J. Photogramm. Remote Sens.

    (2010)
  • E. Paling et al.

    Assessing the extent of mangrove change caused by Cyclone Vance in the eastern Exmouth Gulf, northwestern Australia

    Estuar. Coast. Shelf Sci.

    (2008)
  • S.L. Powell et al.

    Quantification of live aboveground forest biomass dynamics with Landsat time-series and field inventory data: a comparison of empirical modeling approaches

    Remote Sens. Environ.

    (2010)
  • J. Qi et al.

    A modified soil adjusted vegetation index

    Remote Sens. Environ.

    (1994)
  • V.F. Renó et al.

    Assessment of deforestation in the Lower Amazon floodplain using historical Landsat MSS/TM imagery

    Remote Sens. Environ.

    (2011)
  • V.F. Rodriguez-Galiano et al.

    An assessment of the effectiveness of a random forest classifier for landcover classification

    ISPRS J. Photogramm. Remote Sens.

    (2012)
  • D. Roy et al.

    A general method to normalize Landsat reflectance data to nadir BRDF adjusted reflectance

    Remote Sens. Environ.

    (2016)
  • G. Staben et al.

    Obtaining biophysical measurements of woody vegetation from high resolution digital aerial photography in tropical and arid environments: Northern Territory, Australia

    Int. J. Appl. Earth Observ. Geoinform.

    (2016)
  • D. Stojanova et al.

    Estimating vegetation height and canopy cover from remotely sensed data with machine learning

    Ecol. Inform.

    (2010)
  • C.J. Tucker

    Red and photographic infrared linear combinations for monitoring vegetation

    Remote Sens. Environ.

    (1979)
  • S.M. Vicente-Serrano et al.

    Assessment of radiometric correction techniques in analyzing vegetation variability and change using time series of Landsat images

    Remote Sens. Environ.

    (2008)
  • C.E. Woodcock et al.

    Monitoring large areas for forest change using Landsat: generalization across space, time and Landsat sensors

    Remote Sens. Environ.

    (2001)
  • M.A. Wulder et al.

    Lidar sampling for large-area forest characterization: a review

    Remote Sens. Environ.

    (2012)
  • J.D. Armston et al.

    Prediction and validation of foliage projective cover from Landsat-5 TM and Landsat-7 ETM + imagery

    J. Appl. Remote Sens.

    (2009)
  • C.S. Bach

    Phenological patterns in monsoon rainforests in the Northern Territory, Australia

    Austral Ecol.

    (2002)
  • A. Bannari et al.

    A review of vegetation indices

    Remote Sens. Rev.

    (1995)
  • J. Barlow et al.

    Detecting translational landslide scars using segmentation of Landsat ETM+ and DEM data in the northern Cascade Mountains, British Columbia

    Can. J. Remote Sens.

    (2003)
  • M. Belgiu et al.

    Random forest in remote sensing: a review of applications and future directions

    ISPRS J. Photogramm. Remote Sens.

    (2016)
  • D. Bowman et al.

    Fire and cyclone damage to woody vegetation on the north coast of the Northern Territory, Australia

    Aust. Geogr.

    (1994)
  • L. Breiman

    Random forests

    Mach. Learn.

    (2001)
  • J. Brock

    Remnant Vegetation Survey: Darwin to Palmerston Region

    (1995)
  • P. Brocklehurst et al.

    Mangrove survey of Darwin Harbour Northern Territory (N.T.)

    (1996)
  • C. Buschmann et al.

    In vivo spectroscopy and internal optics of leaves as basis for remote sensing of vegetation

    Int. J. Remote Sens.

    (1993)
  • W.B. Cohen et al.

    Landsat's role in ecological applications of remote sensing

    BioScience

    (2004)
  • G.D. Cook et al.

    The impact of wind on trees in Australian tropical savannas: lessons from Cyclone Monica

    Austral Ecol.

    (2008)
  • G.D. Cook et al.

    Stocks and dynamics of carbon in trees across a rainfall gradient in a tropical savanna

    Austral Ecol.

    (2015)
  • N. Coops et al.

    Estimation of eucalypt forest leaf area index on the South Coast of New South Wales using Landsat MSS data

    Aust. J. Bot.

    (1997)
  • D.R. Cutler et al.

    Random forests for classification in ecology

    Ecology

    (2007)
  • T. Danaher et al.

    A regression model approach for mapping woody foliage projective cover using Landsat Imagery in Queensland, Australia

  • S. Ekstrand

    Landsat TM-based forest damage assessment: correction for topographic effects

    Photogramm. Eng. Remote Sens.

    (1996)
  • J.G. Ferwerda et al.

    Differences in regeneration between hurricane damaged and clear-cut mangrove stands 25 years after clearing

    Hydrobiologia

    (2007)
  • N. Flood

    Continuity of reflectance data between Landsat-7 ETM+ and Landsat-8 OLI, for both top-of-atmosphere and surface reflectance: a study in the Australian landscape

    Remote Sens.

    (2014)
  • N. Flood et al.

    An operational scheme for deriving standardised surface reflectance from Landsat TM/ETM+ and SPOT HRG imagery for Eastern Australia

    Remote Sens.

    (2013)
  • E.R. Fox

    Deciduous vine thickets of the Darwin area and effects of cyclone ‘Tracy’ 25 December 1974

    (1980)