Error analysis and correction of spatialization of crop yield in China – Different variables scales, partitioning schemes and error correction methods

doi:10.1016/j.compag.2018.03.031

Computers and Electronics in Agriculture

Volume 148, May 2018, Pages 272-279

https://doi.org/10.1016/j.compag.2018.03.031

Highlights

• We explored the influence of variable scales on the precision of crop yield spatialization.
• We examined the relationship between partitioning schemes and the precision of crop yield spatialization.
• We compared the pros and cons of seven different error correction methods.

Abstract

Spatialization of crop yield supports the integrated analysis of interdisciplinary data. Multivariable linear regression models are often applied to the spatialization of attribute data. When such a model is constructed, the variable scales and the partitioning of China should be considered, because different variable scales and partitioning schemes inevitably result in different spatialization errors. These errors can be reduced by error correction methods, and different methods affect the accuracy of crop yield spatialization differently. In this study, three variable scales were selected: prefectural scale, county scale and grid cell (1 km × 1 km). Five partitioning schemes (no partition of China, 7 regions of China, 9 regions of China, 10 regions of China, and partition of China by province) were considered. A total of 28 multivariable linear regression models were constructed with the areas of different farmland types as independent variables and crop yields as dependent variables. Seven error correction methods were then used to correct the crop yield spatialization results. Three error evaluation indicators were selected to investigate the influence of different variable scales, partitioning schemes and error correction methods on the precision of the spatialization results. The conclusions are as follows: (a) Nine models with intercept based on variables at regional scale could not be used to spatialize crop yield, while the others could. (b) The precision of the spatialization result based on a model without intercept is higher than that based on a model with intercept. (c) For models without intercept, the precision of the spatialization results first increased and then decreased as the partitioning scheme was refined. (d) For models without intercept, the precision of the spatialization results improved as the variable scale was refined from prefectural scale to county scale and grid scale. (e) Among the seven error correction methods, the average correction method, weight coefficient correction method II and weight coefficient correction method III cannot be used to correct the initial spatialization results. (f) The proportional coefficient correction method and weight coefficient correction methods I, IV and V can be used to correct the initial spatialization results. (g) Among the error correction methods that improve the precision of the initial spatialization products, the precisions of the corrected products are very close to one another. This research addresses the lack of spatial error analysis for crop yield, explores the relationship between spatialization error and different sample scales and partitioning schemes, and compares the pros and cons of different error correction methods. It also provides valuable information for work on other types of social and economic statistical data.

Introduction

Against a backdrop of global environmental change and climate change, traditional geo-ecological processes have changed drastically over the past few decades. Geographical processes are no longer simple natural processes, and research on ecological processes is no longer confined to the dynamics and development of ecosystems. The integration and intersection of multiple disciplines is becoming an important characteristic of modern geo-ecological processes (Fu et al., 2006).

Applying statistics to the study of geo-ecological processes is an important sign of the coupling between human activities and geo-ecological processes. Socio-economic statistics are collected and published by administrative division, so they have low spatial resolution and do not describe the spatial distribution of the quantities they measure. This makes them difficult to analyze together with other data in practice, which greatly limits their application in geographical research. There are three major problems: first, the contradiction between the spatial heterogeneity of geographical elements and the homogeneity of statistics within the same administrative division; second, the mismatch between landscape scale and statistical scale; third, the inconsistency of statistical indicators across regions (Liu and Li, 2012). The spatialization of socio-economic statistics can solve these problems effectively (Liao and Zhang, 2009).

Numerous studies have focused on the spatialization of socio-economic statistics, including population (Tobler et al., 1995, Tobler et al., 1997, Sutton et al., 2001, Tian et al., 2005) and gross domestic product (GDP) statistics (Ebener et al., 2005, Doll et al., 2006, Sutton et al., 2007, Elvidge et al., 1997, Elvidge et al., 2009a, Elvidge et al., 2009b and Ghosh et al., 2009). With the rapid development of Remote Sensing (RS) and Geographic Information System (GIS) technology, the spatialization of agricultural production data has been studied frequently, mainly crop acreage (Qiu et al., 2003, Leff et al., 2004, You and Wood, 2006, You et al., 2009, Monfreda et al., 2008, Khan et al., 2010, Zhang et al., 2013, Jin et al., 2015, Salmon et al., 2015, Liu et al., 2017) and agricultural production inputs (Potter et al., 2010, Sun et al., 2010, Yan and Pan, 2014). However, there are fewer studies on crop yield spatialization. For instance, Shi et al. used cultivated land data to spatialize statistics of maize yield per unit area with a multivariable linear regression model, and obtained a spatial distribution map of maize yield per unit area in Jilin province (Shi et al., 2011). Liu et al. took population density as the dependent variable and crop yield as the independent variable to construct a regression model with the support of land use data. The model was then applied to spatialize provincial-level crop yield statistics, resulting in a 1 km by 1 km distribution map of crop yield in China in 2000, and the precision of the spatialization results was analyzed from provincial scale down to prefectural scale and county scale (Liu and Li, 2012). But few studies have explored the influence of variable scales and partitioning schemes on the precision of crop yield spatialization.

As a frequently used geo-data processing method, spatialization of attribute data inevitably introduces errors. These errors can be reduced by correcting the initial spatialization results. Several error correction methods have been used for this purpose, such as the average correction method (Wu et al., 2015), the proportional coefficient correction method (Shi et al., 2016), and the weight coefficient correction method based on the idea that different farmland types have the same weight (Liao and Qin, 2014). However, few studies have compared the pros and cons of different error correction methods. In this study, we therefore discuss the influence of some new error correction methods on crop output spatialization and compare them with the existing methods, with the aim of improving spatialization precision.

This study attempts to simulate the spatial distribution of crop yield in China using land use data, with the following objectives: (1) exploring the influence of variable scales on the precision of crop yield spatialization; (2) detecting the influence of partitioning schemes on the precision of crop yield spatialization; and (3) comparing the pros and cons of different error correction methods.

Section snippets

Data sources

Five datasets are used for this study.

1. County-level and prefecture-level crop yield statistics of China in 2010. The data come from the Statistical Yearbook of China in 2011.
2. Land use dataset of China in 2010. The dataset is provided by the Data Center for Resources and Environmental Sciences, Chinese Academy of Sciences (RESDC) (http://www.resdc.cn).
3. County-level administrative map of China in 2010. It mostly includes vector data of county-level administrative boundaries in China and other attribute data,

Research method

Crop output is proportional to farmland area, and different farmland types influence crop output differently. The multivariate linear regression analysis method (MLRAM) is the most frequently used method for spatializing attribute data, so we chose MLRAM to spatialize crop output. Its basic formula is as follows:

Suppose one dependent variable $y$ is affected by $k$ independent variables $(x_1, x_2, \ldots, x_k)$, and there are $n$ groups of observed values $(y_a, x_{1a}, x_{2a}, \ldots, x_{ka})$, $a = 1, 2, \ldots,$
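The regression setup described above can be illustrated with a minimal sketch. It assumes ordinary least squares via NumPy; the function names, the two farmland types and the sample numbers are hypothetical and are not taken from the paper. The `with_intercept` flag mirrors the distinction the abstract draws between models with and without intercept.

```python
import numpy as np

def fit_linear_model(areas, yields, with_intercept=False):
    """Least-squares fit of crop yield on farmland areas.

    areas:  (n, k) array, one row per administrative unit, one column per farmland type
    yields: (n,) array of crop yield statistics for the same units
    Returns the coefficient vector (intercept first when with_intercept=True).
    """
    X = np.asarray(areas, dtype=float)
    y = np.asarray(yields, dtype=float)
    if with_intercept:
        X = np.column_stack([np.ones(len(X)), X])  # prepend a constant column
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict(coef, areas, with_intercept=False):
    """Apply fitted coefficients to farmland areas (e.g. per grid cell)."""
    X = np.asarray(areas, dtype=float)
    if with_intercept:
        return coef[0] + X @ coef[1:]
    return X @ coef

# Hypothetical sample: 5 units, 2 farmland types (illustrative numbers only).
areas = np.array([[10.0, 5.0], [4.0, 8.0], [7.0, 7.0], [1.0, 12.0], [9.0, 2.0]])
yields = areas @ np.array([6.0, 3.0])  # synthetic yields with known coefficients

coef = fit_linear_model(areas, yields)                 # model without intercept
empty_cell = predict(coef, np.array([[0.0, 0.0]]))     # cell with no farmland
```

In a model without intercept, a cell containing no farmland is necessarily assigned zero yield, whereas a model with a nonzero intercept assigns it the intercept value.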

Error correction

Error is the difference between the value simulated by the model and the actual observed value. The basic formula is $\varepsilon = y - y_i$, where $\varepsilon$ is the error, $y$ is the statistical value and $y_i$ is the simulated value.

The purpose of error correction is to improve spatialization precision by allocating errors to the initial spatialization results using methods such as the average correction method (Wu et al., 2015), the proportional coefficient correction method (Shi et al., 2016), weight coefficient correction method based
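As a rough illustration of how an error $\varepsilon = y - y_i$ might be allocated back to the initial results, the sketch below shows two simple schemes. The formulas are assumptions made for illustration, since this snippet does not give the paper's definitions: "average" is taken to mean splitting the error equally among the grid cells of an administrative unit, and "proportional" is taken to mean rescaling every cell by the ratio of the statistic to the simulated total. The function names and numbers are hypothetical.

```python
def average_correction(cells, statistic):
    """Assumed form: spread the unit's error equally over all of its grid cells."""
    error = statistic - sum(cells)      # epsilon = y - y_i for the whole unit
    share = error / len(cells)
    return [c + share for c in cells]

def proportional_correction(cells, statistic):
    """Assumed form: rescale each cell by statistic / simulated total."""
    simulated = sum(cells)
    if simulated == 0:
        raise ValueError("simulated total is zero; ratio is undefined")
    ratio = statistic / simulated
    return [c * ratio for c in cells]

# Hypothetical unit: four grid cells whose simulated yields sum to 100,
# while the published statistic for the unit is 120.
cells = [0.0, 20.0, 30.0, 50.0]
statistic = 120.0

avg = average_correction(cells, statistic)
prop = proportional_correction(cells, statistic)
```

Both schemes make the corrected cells sum to the statistic. In this toy case the equal-share scheme assigns yield to the cell that was simulated as zero, while the rescaling scheme leaves that cell at zero; this is an observation about the sketch, not a result from the paper.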

Conclusions

In this paper, three variable scales were selected: prefectural scale, county scale and grid cell (1 km × 1 km). Five partitioning schemes (no partition of China, 7 regions of China, 9 regions of China, 10 regions of China, and partition of China by province) were considered. A total of 28 multivariable linear regression models were constructed with the areas of different farmland types as independent variables and crop yields as dependent variables. Then, seven kinds of error

Acknowledgements

This work was supported by the National Key R&D Program of China [Grant number 2016YFA0602702].

References (39)

C.N.H. Doll et al. Mapping regional economic activity from night-time light satellite imagery. Ecol. Econ. (2006)
C.D. Elvidge et al. A global poverty map derived from satellite data. Comput. Geosci. (2009)
M.R. Khan et al. Disaggregating and mapping crop statistics using hypertemporal remote sensing. Int. J. Appl. Earth Obs. Geoinf. (2010)
J.M. Salmon et al. Global rain-fed, irrigated, and paddy croplands: A new high resolution map derived from remote sensing, crop inventories and climate data. Int. J. Appl. Earth Obs. Geoinf. (2015)
K. Shi et al. Detecting spatiotemporal dynamics of global electric power consumption using DMSP-OLS nighttime stable light data. Appl. Energy (2016)
Y. Tian et al. Modeling population density using land cover data. Ecol. Model. (2005)
L.Z. You et al. An entropy approach to spatial disaggregation of agricultural production. Agric. Syst. (2006)
L.Z. You et al. Generating plausible crop distribution maps for sub-saharan africa using a spatially disaggregated data fusion and optimization approach. Agric. Syst. (2009)
Chinese Academy of Agricultural Sciences (CAAS). Planting Regionalization in China (1984)
S. Ebener et al. From wealth to health: modelling the distribution of income per capita at the sub-national level using night-time light imagery. Int. J. Health Geogr. (2005)
C.D. Elvidge et al. Relation between satellite observed visible-near infrared emissions, population, economic activity and electric power consumption. Int. J. Remote Sens. (1997)
C.D. Elvidge et al. A fifteen year record of global natural gas flaring derived from satellite data. Energies (2009)
B. Fu et al. Progress and perspective of geographical-ecological processes. Acta Geogr. Sinica (2006)
T. Ghosh et al. Estimation of Mexico’s Informal Economy and Remittances Using Nighttime Imagery. Remote Sensing (2009)
X. Jin et al. Farmland dataset reconstruction and farmland change analysis in China during 1661–1985. J. Geogr. Sci. (2015)
B. Leff et al. Geographic distribution of major crops across the world. Global Biogeochem. Cycles (2004)
S. Liao et al. A spatialization method for survey data of theoretical stock-carrying capacity of grassland in China and its application. Geogr. Res. (2014)
S. Liao et al. Study on error evaluating index for spatialisation of attribute data. J. Geo-inform. Sci. (2009)
Z. Liu et al. Spatial distribution of China crop output based on land use and population density. Trans. Chinese Soc. Agric. Eng. (Trans. CSAE) (2012)
