International Journal of Applied Earth Observation and Geoinformation
Classifiers vs. input variables—The drivers in image classification for land cover mapping
Introduction
Detailed and accurate land cover data are among the most crucial types of information required for large-scale environmental research. Knowledge of the spatial configuration of the Earth's surface is key to assessing habitat distribution, landscape composition or land use changes, and is an essential requirement for landscape modelling and scenario building, particularly in times of global change. The suitability of remote sensing for acquiring land cover data has long been recognised, but the process of generating land cover information from remotely sensed data is still far from being standardised or optimised (Foody, 2002; Lu and Weng, 2007). An extensive variety of multi-spectral image classification methods has been developed and was recently reviewed by Lu and Weng (2007). None of these classifiers is described as inherently superior to the others, as their performance largely depends on the kind and quality of the input data and on the desired output. Even unsupervised ISODATA classification has been used successfully, for example to extract specific, spectrally distinct features such as forests, fire scars, coastlines or urban areas (Ekercin, 2007; Heinl et al., 2006; Kaya and Curran, 2006; Souza et al., 2003). However, for obtaining thematic land cover data, supervised classification is to be preferred in most cases (Foody, 2001; Jensen, 2005; Kavzoglu, 2009), as the desired output classes are already pre-defined and post-classification analyses and class aggregations are not necessarily required. In particular, advanced approaches such as artificial neural networks, fuzzy sets or support vector machines have produced higher levels of accuracy than, for example, the popular maximum likelihood classifier or discriminant analysis (Berberoglu et al., 2007; Dixon and Candade, 2008; Jensen, 2005; Kavzoglu and Mather, 2003; Kavzoglu and Reis, 2008; Pal and Mather, 2005).
However, only a few specific comparisons have been published (Berberoglu et al., 2007; Hardin, 2000; Kavzoglu and Reis, 2008; Paola and Schowengerdt, 1995; Zhang et al., 2007). These usually document a superiority of the advanced approaches, although maximum likelihood classification has also been suggested as the better alternative (Carvalho et al., 2004). The use of different numbers and types of land cover classes and of different sample sizes complicates a quantitative comparison of the results. Despite its often documented inferiority in classification success, maximum likelihood classification is still one of the most widely used classification algorithms (Jensen, 2005), most likely also because of its advantages in data handling and processing times (Paola and Schowengerdt, 1995). Many applied landscape-scale studies and much land use/land cover research therefore rely on these standard classification approaches (Brandt and Townsend, 2006; Cushman and Wallin, 2000; Jianchu et al., 2005; Joy et al., 2003; Ruiz-Luna and Berlanga-Robles, 2003). In contrast, advanced approaches are primarily limited to methodological studies on optimising the classification process, often using only very limited sample sizes (Fassnacht et al., 2006; Foody, 2001; Kavzoglu and Mather, 2003; Kavzoglu and Reis, 2008; Ouyang and Ma, 2006; Paola and Schowengerdt, 1997; Yemefack et al., 2006).
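As an illustration of the maximum likelihood classifier discussed above, the following is a minimal sketch of its decision rule. It assumes Gaussian class distributions, equal prior probabilities and only two input bands, so that the covariance matrix can be inverted in closed form. The function names, class labels and training values are hypothetical and are not taken from the study.

```python
import math

def fit_class(samples):
    """Estimate the mean vector and 2x2 covariance of one class.

    samples: list of (band_a, band_b) training pixels (hypothetical values).
    Returns ((mean_a, mean_b), (var_a, cov_ab, var_b)).
    """
    n = len(samples)
    mean_a = sum(s[0] for s in samples) / n
    mean_b = sum(s[1] for s in samples) / n
    var_a = sum((s[0] - mean_a) ** 2 for s in samples) / (n - 1)
    var_b = sum((s[1] - mean_b) ** 2 for s in samples) / (n - 1)
    cov_ab = sum((s[0] - mean_a) * (s[1] - mean_b) for s in samples) / (n - 1)
    return (mean_a, mean_b), (var_a, cov_ab, var_b)

def log_likelihood(pixel, mean, cov):
    """Log density of a bivariate Gaussian evaluated at pixel."""
    var_a, cov_ab, var_b = cov
    det = var_a * var_b - cov_ab * cov_ab
    da = pixel[0] - mean[0]
    db = pixel[1] - mean[1]
    # Squared Mahalanobis distance via the closed-form inverse of a 2x2 matrix
    maha = (var_b * da * da - 2.0 * cov_ab * da * db + var_a * db * db) / det
    return -0.5 * (math.log(det) + maha) - math.log(2.0 * math.pi)

def classify(pixel, models):
    """Assign the class whose Gaussian model gives the highest likelihood."""
    return max(models, key=lambda name: log_likelihood(pixel, *models[name]))

# Hypothetical training pixels: (red reflectance, near-infrared reflectance)
training = {
    "water": [(0.05, 0.02), (0.06, 0.03), (0.04, 0.02), (0.05, 0.04)],
    "forest": [(0.04, 0.40), (0.05, 0.45), (0.03, 0.38), (0.06, 0.42)],
}
models = {name: fit_class(samples) for name, samples in training.items()}

label_dark = classify((0.05, 0.03), models)
label_bright_nir = classify((0.045, 0.41), models)
```

Because each class is summarised by a mean vector and a covariance matrix only, training and prediction are cheap, which is in line with the data handling and processing advantages mentioned above.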
Besides the type of image classifier, the use of ancillary data is recognised as being crucial for the performance of image classification. Ancillary data have been used successfully to improve image classification, especially by including topographic measures, NDVI or texture measures in addition to the spectral information in order to separate features with similar spectral properties (Berberoglu et al., 2007; Carpenter et al., 1999; Giannetti et al., 2001; Islam et al., 2008; Joy et al., 2003; Kozak et al., 2008; Lu and Weng, 2007; Saadat et al., 2008; Watanachaturaporn et al., 2008).
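As a small illustration of one such ancillary variable, NDVI is commonly computed per pixel as (NIR - Red) / (NIR + Red). The sketch below assumes that the red and near-infrared values come from ETM+ bands 3 and 4 and that both bands are given as equally sized nested lists. The function names and pixel values are hypothetical, and returning 0.0 for pixels without signal is an assumption of this sketch.

```python
def ndvi(red, nir):
    """Return the NDVI of one pixel; 0.0 if both bands are zero (assumed convention)."""
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total

def ndvi_band(red_band, nir_band):
    """Compute an NDVI layer from two equally sized 2-D bands (lists of rows)."""
    return [
        [ndvi(r, n) for r, n in zip(red_row, nir_row)]
        for red_row, nir_row in zip(red_band, nir_band)
    ]

# Hypothetical 2 x 2 pixel values for the red and near-infrared bands
red_band = [[30, 60], [0, 40]]
nir_band = [[90, 60], [0, 120]]

ndvi_layer = ndvi_band(red_band, nir_band)
```

The resulting layer can then be added to the spectral bands as one more input variable per pixel.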
Despite decades of extensive research on classifiers and ancillary data, comparisons and applications of image classifiers using standardised samples at the landscape scale are largely missing (Lu and Weng, 2007). To close this gap, the present study jointly examines the performance of different classifiers and the importance of ancillary data for landscape-scale land cover assessments using pre-defined land cover classes. It investigates the effect of a variety of selected and widely accessible input variables and classifiers on classification accuracy, both overall and at the level of specific land cover classes, and assesses and quantifies the importance of these components in image classification. We hypothesize that advanced classification approaches achieve higher overall accuracies than standard classifiers when little or no ancillary data are used, while incorporating ancillary data reduces the importance of the type of classifier. Specifically, the study compares the performance of maximum likelihood classification, discriminant analysis and artificial neural networks, which presumably cover the most widely used hard classifiers and represent both parametric and non-parametric classifiers. Ancillary data in the form of topographic measures and NDVI were incorporated step-wise into the classification to document the relevance of these input data. Classification results at the level of land cover classes are discussed in the context of reference data selection and land cover class definition.
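To make the two accuracy levels mentioned above concrete, the following sketch derives overall accuracy and per-class (producer's) accuracy from a confusion matrix. It assumes that rows hold the classified labels and columns the reference labels. The matrix values are invented for illustration and are not results of the study.

```python
def overall_accuracy(matrix):
    """Share of all reference samples that were classified correctly."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        return 0.0
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

def producers_accuracy(matrix):
    """Per-class accuracy: correct samples divided by the reference (column) total."""
    result = []
    for j in range(len(matrix)):
        column_total = sum(row[j] for row in matrix)
        result.append(matrix[j][j] / column_total if column_total else 0.0)
    return result

# Hypothetical confusion matrix for three land cover classes
# (rows: classified label, columns: reference label)
confusion = [
    [45, 5, 0],
    [5, 25, 5],
    [0, 10, 5],
]

overall = overall_accuracy(confusion)
per_class = producers_accuracy(confusion)
```

In this invented example the overall accuracy is 75%, while the per-class values range from 50% to 90%, which shows why a single overall figure can hide weak classes.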
Section snippets
Spectral data properties and study region
The spectral information for the image classification was acquired by the Landsat 7 ETM+ sensor (path 193/row 027) on 13 September 1999. The imagery was provided by the Global Land Cover Facility (GLCF) (www.landcover.org) as an orthorectified GeoCover data set in GeoTIFF format with UTM projection (UTM 32N), WGS-84 datum, and 28.5 m pixel size. The six bands representing the visible and infrared spectrum (ETM+ bands 1–5, 7) were used in the study. The scene was cut to 1650 × 3300 pixels to fit to the
Overall classification accuracy related to classifiers and input variables
The classifications by DA and MLC produced very similar overall accuracies for all input combinations. Accuracies were in the range of 55–60% when only spectral data (ETM) were used as input variables and reached about 75% when ancillary data were included (Fig. 2). The classifications using ANN produced higher overall accuracies than MLC and DA for all input combinations, reaching about 75% with spectral data only (ETM) and 85% with ancillary data. Maximum overall classification
The relevance of input variables and classifiers for image classification accuracy
Spectral data, topographic measures and NDVI data were used to test their performance in image classifications by maximum likelihood classification (MLC), discriminant analysis (DA) and artificial neural networks (ANN). The use of ancillary data significantly improved the classification accuracy for the present data set compared to using spectral data (ETM) only. These increases in overall accuracy were observed independent of the classifier. Especially incorporating topographic information
Conclusion
The comparison of the performance of MLC, DA and ANN in image classification revealed advantages of ANN classifications in classification accuracy overall and for single land cover classes. The incorporation of ancillary data into the classification process clearly increased classification accuracy overall and at the level of single land cover classes, independent of the classifier used. However, ANN also produced high accuracies with limited input information, while MLC and DA produced comparable
Acknowledgements
The research was kindly supported by the University of Innsbruck Vice Rectorate for Research and the European Academy Bolzano (EURAC). The authors thank two anonymous reviewers for their valuable comments and suggestions.
References (57)
- et al. Accuracy and congruency of three different digital land-use maps. Landscape and Urban Planning (2006)
- et al. Texture classification of Mediterranean land cover. International Journal of Applied Earth Observation and Geoinformation (2007)
- et al. National Park vegetation mapping using multitemporal Landsat 7 data and a decision tree classifier. Remote Sensing of Environment (2003)
- et al. A neural network method for efficient vegetation mapping. Remote Sensing of Environment (1999)
- et al. Contribution of multispectral and multitemporal information from MODIS images to land cover classification. Remote Sensing of Environment (2008)
- et al. Selection of imagery data and classifiers for mapping Brazilian semideciduous Atlantic forests. International Journal of Applied Earth Observation and Geoinformation (2004)
- et al. Land cover classification with AVHRR multichannel composites in northern environments. Remote Sensing of Environment (1996)
- et al. Key issues in making and using satellite-based maps in ecology: a primer. Forest Ecology and Management (2006)
- Status of land cover classification accuracy assessment. Remote Sensing of Environment (2002)
- et al. Integrated use of satellite images, DEMs, soil and substrate data in studying mountainous lands. International Journal of Applied Earth Observation and Geoinformation (2001)
- Exploring the spatial and temporal dynamics of land use in Xizhuang watershed of Yunnan, southwest China. International Journal of Applied Earth Observation and Geoinformation
- Increasing the accuracy of neural network classification using refined training data. Environmental Modelling & Software
- Monitoring urban growth on the European side of the Istanbul metropolitan area: A case study. International Journal of Applied Earth Observation and Geoinformation
- European forest cover mapping with high resolution satellite data: the Carpathians case study. International Journal of Applied Earth Observation and Geoinformation
- A comparison of single date and multitemporal satellite image classifications in a semi-arid grassland. Journal of Arid Environments
- Accuracy assessment using sub-pixel fractional error matrices of global land cover products derived from satellite data. Remote Sensing of Environment
- Mapping land use/cover in a tropical coastal area using satellite sensor data, GIS and artificial neural networks. Estuarine Coastal and Shelf Science
- Pixel- and site-based calibration and validation methods for evaluating supervised classification of remotely sensed data. Remote Sensing of Environment
- Landform classification from a digital elevation model and satellite imagery. Geomorphology
- Land use classification in mountainous areas: integration of image processing, digital elevation data and field knowledge (application to Nepal). International Journal of Applied Earth Observation and Geoinformation
- Mapping forest degradation in Eastern Amazon from SPOT4 through spectral mixture models. Remote Sensing of Environment
- Investigating relationships between Landsat-7 ETM+ data and spatial segregation of LULC types under shifting agriculture in southern Cameroon. International Journal of Applied Earth Observation and Geoinformation
- Neural Networks for Pattern Recognition
- Land use–land cover conversion, regeneration and degradation in the high elevation Bolivian Andes. Landscape Ecology
- An approach for land cover mapping with multi-temporal MERIS imagery
- Rates and patterns of landscape change in the Central Sikhote-alin Mountains, Russian Far East. Landscape Ecology
- NDVI-derived land-cover classifications at a global-scale. International Journal of Remote Sensing