Comparing three global parametric and local non-parametric models to simulate land use change in diverse areas of the world
Introduction
Land cover refers to the physical cover of the Earth's surface (e.g., water, vegetation and man-made features), while land use is defined by the human activity that changes or maintains the land (Turner et al., 1995). Many earth, environmental, and atmospheric science applications are concerned with the spatial distribution of land use change (LUC) and the resulting impacts on land cover and related ecosystem processes (NRC, 2005, NCR, 2007). The way humans alter the land can significantly influence different ecosystem services (Meehan et al., 2013) and related systems such as global climate patterns (Pielke, 2005, Pijanowski et al., 2011), the economy (Wernick, 2007), biodiversity (Sala et al., 2000), forest growth (Willert et al., 2010), food security (Foley et al., 2005), and the water cycle (Peng et al., 2002).
Urbanization (Alberti and Marzluff, 2004, Jokar et al., 2014), agricultural change, and forest growth are important themes in socio-economic research (Grimm et al., 2008). At a global scale (Letourneau et al., 2012), for instance, extensive conversion from natural vegetation to agricultural classes occurred during the 1970s and into the mid-1980s (Armesto et al., 2009). After the mid-1980s, however, agricultural growth occurred with some degree of intensification in the areas more suitable for agriculture (Armesto et al., 2009). Some recovery of forests on abandoned agricultural land has occurred recently (Diaz et al., 2011), especially in the Eastern United States (Nair, 1993). Because of the worldwide threat to food security (Jenerette and Wu, 2001), researchers (Lambin and Meyfroidt, 2011, Pijanowski and Robinson, 2011) have assessed the effects of LUC on agricultural lands across the globe in various studies, e.g., cropping practices in Africa (Washington-Ottombre et al., 2010), loss of agricultural lands due to deforestation in the United States (Weinhold, 1999) and different scenarios of LUC trends in Europe (Ewert et al., 2005).
Today, people in the world are more dependent upon forest resources for meeting essential needs (FAO, 2009). LUC, particularly urbanization, across the globe is rapidly increasing and is the root cause of current threats to biodiversity and species extinctions as well (Czech et al., 2000). LUC removes habitat directly during construction and fragments the remaining habitat (Swenson and Franklin, 2000). Urban systems are also involved with serious issues such as urban heat islands (Arnfield, 2003), carbon dioxide domes (Grimmond et al., 2002), and high-level nitrogen deposition (Bowen and Valiela, 2001), which can affect the future of the global ecosystem.
LUC assessments (Rhemtulla et al., 2007) show that land use across a region has generally proceeded from natural land (e.g., forests) to human-dominated use (commonly agriculture), and then finally to urban. Some rural (previously agricultural) areas in the United States have converted to secondary forests as marginal agricultural land was set aside, or have transitioned to mixed rural residential/secondary forests as urban areas expanded along the urban/rural fringe. However, analysis at landscape scales (Jenerette and Wu, 2001) shows that patterns can shift rapidly and that even transitions from one class to another (e.g., forest to agriculture) can be quick. Detailed knowledge of LUC is key to sustainable planning in the context of agricultural change, urban development, and forest conservation.
The land use science community widely accepts that LUC is a complex process, with multiple drivers operating at a variety of spatial and temporal scales and arising from diverse sources such as policy, behavior, economics, and natural features (Serneels and Lambin, 2001). Advanced approaches, such as data mining tools, are therefore required to understand the underlying patterns in LUC data. Data mining, a common term in computer science, is the process of extracting hidden knowledge from data sets (Sut and Simsek, 2011, Li et al., 2009). A variety of data mining approaches have been applied (Imran et al., 2008) in various disciplines (e.g., economics, medicine, engineering, and environmental science); however, scientists in different fields may use other terms. For example, in LUC science, scientists substitute data mining with other terms such as empirical (He and Lo, 2007), dynamic (Clarke et al., 1997), rule-based (Tayyebi et al., 2010b, Tayyebi et al., 2011b), agent-based (Ralha et al., 2013, Jokar et al., 2013a) and machine learning models (Pijanowski et al., 2009).
Data mining methods generally include two main groups of models (Tayyebi et al., 2013a, Tayyebi et al., 2013b, Wernick, 2007, Liu et al., 2001, Hardle et al., 2004), global parametric models and local non-parametric models, that have been used to quantify the relationship between a dependent variable (LUC) and multiple independent variables (social and environmental drivers). Global parametric models are the most common in the literature of LUC science (Landis and Zhang, 1998, Theobald and Hobbs, 1998, Aspinall, 2004); these approaches present all data to the model. In other words, one model is created that represents the entire dataset. A variety of global parametric models have been applied by modelers, particularly in land use science. Logistic regression is one of the most common statistical global parametric models applied to LUC (e.g., Tayyebi et al., 2008a, Tayyebi et al., 2008b, Tayyebi et al., 2010a, Mertens and Lambin, 2000, Jokar et al., 2013b). Global parametric models for LUC modeling have also included artificial neural networks (ANNs; Pijanowski et al., 2002, Mas et al., 2004, Tayyebi et al., 2011a, Almeida et al., 2008), cellular automata (Batty and Xie, 1994, Clarke et al., 1997, Stevens and Dragićević, 2007) and genetic algorithms (Seppelt and Voinov, 2002). Local non-parametric models, on the other hand, partition the data into subsets and build separate (i.e., local) models of these subsets. Thus, multiple models are generated from the partitioned data. Classification and regression trees (CART; Breiman et al., 1984, Müller et al., 2013) and Multivariate Adaptive Regression Splines (MARS; Friedman, 1991, De Andrés Suárez et al., 2011) are local non-parametric models that have been used widely as data mining tools in multidisciplinary sciences (Zha and Chan, 2005, Abdel-Aty and Haleem, 2011).
Although global parametric models of LUC have received considerable attention during the last three decades (Pontius and Schneider, 2001, Tang et al., 2005a, Tang et al., 2005b), few studies have compared global parametric models with local non-parametric models in LUC science (Pontius et al., 2008). We examine this problem by using three models from the two families of data mining approaches to systematically compare characterizations of LUC. The main question we address is how the mechanisms of global parametric models and local non-parametric models differ in characterizing LUC patterns.
Many scientists have compared CART, ANN and MARS for a variety of applications such as dentistry (Gansky, 2003), water quality (Areerachakul and Sanguansintukul, 2010) and classification of speech patterns (Zha and Chan, 2005). These comparative studies have shown that there are distinct advantages of one tool over another, in some cases due to topic (Psichogios and Ungar, 1992), spatial dimension of analysis (Pijanowski et al., 2005), data form (e.g. input or output data are categorical, continuous or ordinal; Tayyebi et al., 2014), or temporal dimension of analysis (short-term and long-term, Ture et al., 2005); however, a few studies have shown that there are no detectable differences between the variety of data mining approaches employed (Areerachakul and Sanguansintukul, 2010).
Although comparative methodological data mining studies are becoming frequent in the literature (cf. Pontius et al., 2008), the comparison of various data mining tools is still challenging. For example, in an oral health study, Gansky (2003) found that ANN performed better than CART; however, he concluded any comparative study of data mining tools needs multiple model assessment tools and an iterative analysis approach across space and time (e.g., explore size of data, examine goodness of fit, re-evaluate drivers). ANNs are the most common global parametric models that have been applied to a variety of applications; they are often compared to statistical global parametric models. ANNs usually perform better than statistical global parametric models when non-linear patterns and interactions exist in the data (Francis, 2001); however, several researchers have found that regression models outperformed ANNs when the functional relationships between the independent and dependent variables are known (Warner and Misra, 1996). In addition, regression models also provide better explanatory power as the parameters of ANNs are not directly interpretable.
Comparative data mining studies across spatial and temporal scales can be enhanced by examining how they apply to different scenarios of LUC (Pijanowski et al., 2006). Such studies can help determine how universal drivers of LUC operate globally and how they are manifested in particular regions. Some researchers who have conducted comparative studies have found that characteristics of the study area such as sample size, quality of data, how models are built (i.e., training) and calibrated (i.e., testing), and patterns in data can influence which model performs best. For example, CART, MARS and ANN were explored for modeling different forest classes using satellite imagery, and the results were compared with in situ field data within five ecologically different regions in the western United States (Moisen and Frescino, 2002). MARS and ANN showed tremendous advantages over CART for prediction; however, the differences between models were less distinct for the in situ data, which had less noise. Thus, “noisy” data may be modeled best using ANN and MARS. In another study, the Relative Operating Characteristic (ROC) was used to compare CART and MARS for predicting the likelihood of emerging markets using financial data (Büyükbebeci, 2009). The CART approach gave more accurate results in the training run; however, in testing runs, MARS gave more accurate results. Thus, some tools may overfit the data, hindering their ability to generalize from one dataset to another.
Most of the previous studies in LUC science have applied only one data mining tool to multiple regions (Pijanowski et al., 2005, Pijanowski et al., 2006) or multiple data mining tools to one region (Tayyebi et al., 2013a, Tayyebi et al., 2013b). In previous work with the ANN, Pijanowski et al. (2005) applied one tool to simulate urbanization in two regions of the Upper Midwest, United States: the Twin Cities in Minnesota and the Detroit Metropolitan Area. They tested whether the ANN could generalize urbanization patterns across these two regions. The predictive power of the ANN was compared between models whose training and testing data came from the same place (i.e., internally validated by place) and models that were developed in one city and then tested for goodness-of-fit by applying the model parameters to the second city. Pijanowski et al. (2005) found that internally calibrated and validated models outperformed those where ANN network parameters were swapped between metropolitan areas, although generalization occurred for one situation (Detroit applied to Twin Cities) but not for the other. In another study (Tayyebi et al., 2014), multiple data mining tools (ANN and logistic regression) were applied to one region (MRW; urbanization) to examine various uncertainty dimensions in LUC models (e.g., data, model parameter, model structure and model outcome). They found that data uncertainty should be dealt with more carefully to minimize its occurrence. Finally, Pontius et al. (2008) compared the predictive power of various data mining tools (13 LUC models) in the literature that had previously been applied to various regions. Pontius et al. (2008) found that uncertainty is still high in nearly all of the LUC models, indicating the need to further improve our understanding and characterization of LUC.
There is still a lack of comparison studies in LUC science that apply multiple data mining tools to multiple regions at the same time, with various time intervals (e.g., short- and long-term intervals), landscape characteristics (e.g., dominated by a particular land use class), and LUC patterns (e.g., urban, agriculture and forest), to compare LUC models across space and time. We examine this problem by applying three data mining tools across the globe (in the United States and Africa) with various land use transformations (e.g., agriculture, forest and urban), landscape characteristics (e.g., CLIP, SEWI and MRW dominated by agriculture, urban and forest lands, respectively) and time intervals (5, 10 and 20 years in CLIP, SEWI and MRW, respectively) to ensure that we are not obtaining a unique outcome but rather an outcome that is potentially more robust. The main objectives of this study are to: (1) develop a framework to classify data mining tools, (2) understand complex LUC patterns (forest, urban and agriculture) across space (the United States Midwest and Africa) and time (5 years in CLIP, 10 years in SEWI and 20 years in MRW) using three data mining models, and (3) compare the performance of the three data mining approaches across space and time in terms of their potential for forest and urban gain simulations in two different areas of the United States, and agriculture change in Africa, using conventional LUC model goodness-of-fit metrics.
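The training-versus-testing contrast described above can be made concrete with a small sketch. It computes the area under the ROC curve (AUC) from the pairwise-ranking identity and compares the value on a training set with the value on a testing set. The function name `roc_auc` and every label and score below are hypothetical values invented for illustration; they are not data or code from the cited studies.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve: the fraction of (change, no-change) pairs
    in which the change case receives the higher score (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical observed outcomes (1 = change, 0 = no change) and model scores.
train_labels = [1, 1, 1, 0, 0, 0]
train_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]  # perfectly ranked on training data
test_labels = [1, 1, 1, 0, 0, 0]
test_scores = [0.9, 0.4, 0.6, 0.5, 0.2, 0.1]   # one pair mis-ranked on testing data

train_auc = roc_auc(train_labels, train_scores)
test_auc = roc_auc(test_labels, test_scores)
print("training AUC:", train_auc)
print("testing AUC:", test_auc)
```

In this toy case the training AUC is 1 while the testing AUC is 8/9, the kind of gap that would suggest a model fits its training data better than it generalizes.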
Parametric versus non-parametric models
Data mining tools use parameters to describe the relationship between dependent and independent variables. They can be classified into two main groups based on whether they have a finite or an infinite number of parameters. Parametric models (e.g., logistic regression and ANN) use a finite number of parameters; in other words, the structure of the model is fixed before the model is run (Table 1; Liu et al., 2001, Hardle et al., 2004). However, a non-parametric
Data mining tools
LUC models use land use maps across space and time for training to unravel mechanisms of landscape development. Calibration via the testing data is needed to ensure that the underlying patterns can apply to new data (Manel et al., 1999). Thus, stratified random sampling was used to split the data into two mutually exclusive sets: the training (approximately 70% of the data) and testing (the other 30% of data) data sets. The training dataset was used to train the models; however, testing data was
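A stratified 70/30 split of the kind described in this section can be sketched as follows. This is a minimal illustration that assumes each record carries a class label (here 1 for change and 0 for no change) and that stratification is done on that label; the function name `stratified_split`, the record layout and the example data are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def stratified_split(records, label_of, train_fraction=0.7, seed=0):
    """Split records into mutually exclusive training and testing sets,
    keeping roughly the same class proportions in both."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_class = defaultdict(list)
    for rec in records:
        by_class[label_of(rec)].append(rec)

    train, test = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)  # random sampling within each stratum
        cut = round(len(members) * train_fraction)
        train.extend(members[:cut])
        test.extend(members[cut:])
    return train, test


# Hypothetical cells: (cell_id, label), with 20 "change" and 80 "no change".
cells = [(i, 1 if i % 5 == 0 else 0) for i in range(100)]
train, test = stratified_split(cells, label_of=lambda cell: cell[1])
print(len(train), len(test))
```

Splitting within each class, rather than over the pooled data, keeps a rare class such as change cells represented in both sets.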
Study area
The Climate–Land Interaction Project (CLIP) study area (Olson et al., 2008) is located in East Africa and encompasses five entire countries (Kenya, Uganda, Rwanda, Burundi and Tanzania; Fig. 1a). Approximately 15% of the study area is agriculture. The southeastern Wisconsin (SEWI) region includes seven counties: Kenosha, Milwaukee, Ozaukee, Racine, Walworth, Washington and Waukesha (Fig. 1b; Pijanowski et al., 2006). SEWI is currently dominated by urban in the east and agriculture in the
CART simulation
Fig. 2 shows agriculture, urban, forest gain and non-change using yellow, red, green and gray nodes, respectively, for the CART model simulations (see Supplement 2 for more details about training on CART). Each variable is identified by a variable name, the value at which the split would be made, and the improvement yielded by the split (Fig. 2; Table 5). Tree splitters send low values of a splitter to the left and high values to the right. The best variables are displayed in decreasing order
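The split description above (a variable, a split value, and the improvement yielded by the split, with low values sent left and high values sent right) can be illustrated with a minimal sketch. It assumes that improvement is measured as the decrease in Gini impurity, a common choice for classification trees; the criterion actually used in the paper is not stated in this excerpt. The function names, the driver variable `distance_to_road` and all values are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_improvement(values, labels, threshold):
    """Impurity decrease when low values (<= threshold) go to the left
    child and high values (> threshold) go to the right child."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted


def best_split(values, labels):
    """Try midpoints between consecutive distinct values and return
    (improvement, threshold) for the best one."""
    distinct = sorted(set(values))
    midpoints = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(
        ((split_improvement(values, labels, t), t) for t in midpoints),
        default=(0.0, None),
    )


# Hypothetical driver (metres) and outcome (1 = urban gain, 0 = no change).
distance_to_road = [100, 200, 300, 400, 800, 900, 1000, 1200]
urban_gain = [1, 1, 1, 1, 0, 0, 0, 0]
improvement, threshold = best_split(distance_to_road, urban_gain)
print(threshold, improvement)
```

Ranking candidate variables by the improvement of their best split gives an ordering of the kind a tree report displays.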
Comparison of data mining tools across space and time
A wide variety of LUC models have been developed to serve different processes and scales of analysis (Matthews et al., 2007). However, few studies have compared CART, ANN and MARS with one another in detecting patterns in data over short and long time intervals. The results of this paper show that ANN performs better than CART and MARS over a short-term interval (CLIP; 5 years); however, the three models perform approximately equally in the long-term simulations (MRW; 20 years).
Conclusion
This research compares one global parametric model (ANN) with two local non-parametric models (MARS and CART) in simulating LUC patterns across space and time. The study aimed to investigate the performance of the LTM, CART and MARS methods in predicting agriculture, forest and urban growth in three different regions. The results of this paper are informative for the field of data mining in LUC science. We were able to show that despite the high amount of
Acknowledgments
Funding to complete this research was obtained through the USGS Climate Change Research Program, the Great Lakes Fishery Trust, and the Department of Forestry and Natural Resources, Purdue University.
References (119)
- et al. Analyzing angle crashes at unsignalized intersections using machine learning techniques. Accid. Anal. Prev. (2011)
- et al. Evaluation of modeling techniques for forest site productivity prediction in contrasting ecoregions using stochastic multicriteria acceptability analysis (SMAA). Environ. Model. Softw. (2011)
- et al. Potential mosquito vectors of arboviruses in Portugal: species, distribution, abundance and West Nile infection. Trans. R. Soc. Trop. Med. Hyg. (2008)
- Modelling land use change with generalized linear and generalized additive models – a multi-model analysis of change between 1860 and 2000 in Gallatin Valley, Montana. J. Environ. Manag. (2004)
- et al. Characterizing performance of environmental models. Environ. Model. Softw. (2013)
- et al. A comparison of decision tree classifiers with backpropagation neural networks for multimodal classification problems. Pattern Recognit. (1993)
- et al. Drivers of land abandonment in Southern Chile and implications for landscape planning. Landsc. Urban Plan. (2011)
- et al. Future scenarios of European agricultural land use: I. Estimating changes in crop productivity. Agric. Ecosyst. Environ. (2005)
- et al. On tree structured classifiers.
- et al. Local-scale fluxes of carbon dioxide in urban environments: methodological challenges and results from Chicago. Environ. Pollut. (2002)
- Combining non-parametric models with logistic regression: an application to motor vehicle injury data. Comput. Stat. Data Anal.
- A land-use systems approach to represent land-use dynamics at continental and global scales. Environ. Model. Softw.
- Comparing discriminant analysis, neural networks and logistic regression for predicting species distributions: a case study with a Himalayan river bird. Ecol. Model.
- Assessing land/use cover changes: a nationwide multidate spatial database for Mexico. Int. J. Appl. Earth Observ. Geoinf.
- Comparing five modeling techniques for predicting forest characteristics. Ecol. Model.
- Comparing the determinants of cropland abandonment in Albania and Romania using boosted regression trees. Agric. Syst.
- Integrating diverse methods to understand climate–land interactions in East Africa. Geoforum
- Using neural networks and GIS to forecast land use changes: a land transformation model. Comput. Environ. Urban Syst.
- Rates and patterns of land use change in the Upper Great Lakes States, USA: a framework for spatial temporal analysis. Landsc. Urban Plan.
- A big data urban growth simulation at a national scale: configuring the GIS and neural network based land transformation model to run in a high performance computing environment. Environ. Model. Softw.
- Land-use change model validation by a ROC method for the Ipswich watershed, Massachusetts, USA. Agric. Ecosyst. Environ.
- A multi-agent model system for land-use change simulation. Environ. Model. Softw.
- Coupling land use and groundwater models to map land use legacies: assessment of model uncertainties relevant to land use planning. Appl. Geogr.
- Optimization methodology for land use patterns using spatially explicit landscape models. Ecol. Model.
- It was an artefact not the result: a note on systems dynamic model development tools. Environ. Model. Softw.
- Proximate causes of land-use change in Narok District, Kenya: a spatial statistical model. Agric. Ecosyst. Environ.
- A hybrid analytical-heuristic method for calibrating land-use change models. Environ. Model. Softw.
- Comparison of regression tree data mining methods for prediction of mortality in head injury. Expert Syst. Appl.
- Forecasting land use change and its environmental impact at a watershed scale. J. Environ. Manag.
- An urban growth boundary model using neural networks, GIS and radial parameterization: an application to Tehran, Iran. Landsc. Urban Plan.
- Two rule-based urban growth boundary models applied to the Tehran metropolitan area, Iran. Appl. Geogr.
- Ecological resilience in urban ecosystems: linking urban patterns to human and ecological functions. Urban Ecosyst.
- Classification and regression trees and MLP neural network to classify water quality of canals in Bangkok, Thailand. Int. J. Intell. Comput. Res.
- Old-growth temperate rainforests of South America: conservation, plant–animal interactions, and baseline biogeochemical processes.
- Two decades of urban climate research: a review of turbulence, exchanges of energy and water, and the urban heat island. Int. J. Climatol.
- From cells to cities. Environ. Plan. B
- The ecological effects of urbanization of coastal watersheds: historical increases in nitrogen loads and eutrophication of Waquoit Bay estuaries. Can. J. Fish. Aquat. Sci.
- Classification and Regression Trees.
- Comparison of MARS, CMARS and CART in Predicting Default Probabilities for Emerging Markets.
- A self-modifying cellular automaton model of historical urbanization in the San Francisco Bay area. Environ. Plan. B Plan. Des.
- Economic associations among causes of species endangerment in the United States: associations among causes of species endangerment in the United States reflect the integration of economic sectors, supporting the theory and evidence that economic growth proceeds at the competitive exclusion of nonhuman species in the aggregate. BioScience
- Bankruptcy forecasting: a hybrid approach using Fuzzy c-means clustering and multivariate adaptive regression splines. Expert Syst. Appl.
- Seed Security for Food Security in the Light of Climate Change and Soaring Food Prices: Challenges and Opportunities.
- A review of methods for the assessment of prediction errors in conservation presence/absence models. Environ. Conserv.
- Global consequences of land use. Science
- Neural Networks Demystified.
- Flexible parsimonious smoothing and additive modeling. Technometrics
- Multivariate adaptive regression splines. Ann. Stat.
- Dental data mining: potential pitfalls and practical issues. Adv. Dent. Res.