Robust methods for assessing the accuracy of linear interpolated DEM

https://doi.org/10.1016/j.jag.2014.08.012

Highlights

  • Two robust statistics are proposed for assessing the interpolated DEM accuracy.

  • Confidence intervals are constructed to restrict the interpolated DEM residuals.

  • Asymptotic convergence behaviors are investigated by Monte Carlo simulations.

  • Robust methods produce more reliable DEM accuracy assessments than classical methods.

Abstract

This paper studies methods for assessing the accuracy of a digital elevation model (DEM), with emphasis on robust methods. Based on the population of squared DEM residuals generated by the bi-linear interpolation method, three average-error statistics, (a) the mean, (b) the median, and (c) an M-estimator, are investigated for measuring the accuracy of the interpolated DEM. A confidence interval is constructed for each statistic to further evaluate the DEM quality. The first method relies mainly on the Student's t-distribution, while the second and third are derived from robust theory. The robust methods can counteract the effects of outliers, and even of skewed residual distributions, in DEM accuracy assessment. Monte Carlo simulation experiments investigate the asymptotic convergence behavior of the confidence intervals constructed by the three methods as the sample size increases. The results demonstrate that the robust methods produce more reliable DEM accuracy assessments than the classical t-distribution-based method. The proposed robust methods are therefore recommended for assessing DEM accuracy, particularly where the DEM residual population is evidently non-normal or heavily contaminated with outliers.

Introduction

The digital elevation model (DEM) has been widely used in the GIS field to digitally represent the variation of the Earth's surface. Whether it takes the form of a rasterized DEM with regular square cells or a vector-based triangulated irregular network (TIN), an elevation surface is normally generated by interpolating sampled terrain elevation points (Maune, 2007). Common approaches to capturing the elevation values of a terrain surface include field surveying by total stations, photogrammetry, and airborne LiDAR. These data acquisition methods are normally subject to measurement errors, and the subsequent DEM interpolation models can propagate or further enlarge the errors (Fisher and Tate, 2006). A DEM is therefore subject to a certain level of error, which needs to be properly assessed and reported to DEM users. Meanwhile, with the arrival of the big data era, the demand for spatial data quality information has increased significantly. As pointed out by Shi et al. (2004), measuring the positional error of geo-spatial data is thus a key research issue in the quality assessment of spatial data.

In the field of uncertainty modeling and spatial data quality analysis, DEM error modeling has been at the core of considerable research (Shi, 2010). Much attention has been paid to exploring the effects of different interpolation models on various kinds of DEM data with different spatial resolutions (Chaplot et al., 2006). The common indicator for assessing DEM accuracy is the root mean squared error (RMSE), which has been employed together with trend analysis to evaluate the quality of a DEM produced from ASTER stereoscopy (de Oliveira and Paradella, 2009). Three factors are mainly perceived to account for the uncertainty of an interpolated DEM: (1) source data errors from the spatial data acquisition stage; (2) the DEM interpolation model employed; (3) the complexity of the terrain variation (Shi et al., 2014). Based on approximation theory, Hu et al. (2009) discussed the mathematical relationship between DEM error and these three influencing factors. Normally, type (1) error is regarded as data-based error and is hence highly related to the data capturing methods, while type (2) and (3) errors are perceived to be model-based errors. The latter two types have attracted great attention from scholars investigating how closely the interpolated DEM surface approximates the actual ground surface (Fisher and Tate, 2006). Furthermore, Chaplot et al. (2006) evaluated the performance of five common interpolation techniques, namely inverse distance weighting (IDW), ordinary and universal kriging, radial basis function (RBF), and the regularized spline with tension, on natural landscapes with differing terrain morphologies and geographical scales, pointing out that the accuracy of an interpolated DEM is related to landform type, sampling density, and spatial scale. Guo et al. (2010) investigated the effects of topographic variability and sampling density on the accuracy of LiDAR DEMs generated by several interpolation methods at different spatial resolutions. Liu et al. (2012) presented a new methodology for DEM accuracy assessment based on approximation theory and illustrated its application to DEMs created by linear interpolation using contour lines as the source data.

Statistical testing is an alternative approach to ensuring that spatial data quality requirements are met at a moderate cost. Existing spatial data accuracy standards, such as the National Standard for Spatial Data Accuracy (NSSDA) used in the United States, commonly assume that the positional errors of spatial data are normally distributed (Zandbergen, 2008). However, this assumption may become a bottleneck when such methods have to cope with outliers in real data. For example, DEM source data from airborne LiDAR may contain both low and high outliers due to the presence of high-rise buildings or flying objects such as birds. It therefore makes sense to develop robust methods for measuring DEM accuracy that yield more reliable and less biased assessment results, particularly for data heavily contaminated with outliers.

Robust approaches to DEM accuracy assessment have attracted much attention in recent studies. For instance, Aguilar et al. (2007) explored three approaches to DEM accuracy assessment, the third of which was a non-parametric approach based on the theory of estimating functions that does not assume a normal distribution. Zandbergen (2008) stated that a non-normal distribution of positional errors in spatial data has implications for spatial data accuracy standards and error propagation modeling, and made specific recommendations for revising the NSSDA. Höhle and Höhle (2009) proposed robust statistical measures, such as the median, the normalized median absolute deviation, and sample quantiles, as replacements for the traditional error indicator, RMSE, in the accuracy assessment of DEMs derived from laser scanning and automated photogrammetry. They also discussed requirements regarding the DEM reference data and employed the bootstrap technique to construct confidence intervals for each robust statistical measure. In addition, they addressed the question of how large a sample is required to obtain sufficiently precise estimates of the standard deviation and the sample quantiles. Together, this treatment establishes a potential systematic framework for DEM accuracy assessment by robust statistical methods.
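The robust measures named above can be sketched in a few lines. The following is a minimal illustration, assuming NumPy is available; the function names (`nmad`, `bootstrap_ci`), the percentile form of the bootstrap, and the example error values are assumptions made here for illustration and are not taken from Höhle and Höhle (2009).

```python
import numpy as np

def nmad(errors):
    """Normalized median absolute deviation: 1.4826 * median(|e - median(e)|)."""
    errors = np.asarray(errors, dtype=float)
    med = np.median(errors)
    return 1.4826 * np.median(np.abs(errors - med))

def bootstrap_ci(errors, statistic, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for an arbitrary statistic."""
    errors = np.asarray(errors, dtype=float)
    rng = np.random.default_rng(seed)
    # Resample with replacement and recompute the statistic on each resample.
    idx = rng.integers(0, errors.size, size=(n_boot, errors.size))
    reps = np.array([statistic(errors[row]) for row in idx])
    alpha = 1.0 - level
    lo, hi = np.quantile(reps, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)

# Hypothetical vertical errors in metres: mostly small, plus two gross outliers.
errors = np.array([0.1, -0.2, 0.05, 0.3, -0.15, 0.2, -0.1, 0.0, 12.0, -9.0])

rmse = float(np.sqrt(np.mean(errors ** 2)))       # classical indicator
median_error = float(np.median(errors))           # robust location
nmad_value = float(nmad(errors))                  # robust spread
q95_abs = float(np.quantile(np.abs(errors), 0.95))  # sample quantile of |error|
median_lo, median_hi = bootstrap_ci(errors, np.median)

print(f"RMSE={rmse:.3f}  median={median_error:.3f}  NMAD={nmad_value:.3f}")
print(f"95% quantile of |error|={q95_abs:.3f}")
print(f"bootstrap 95% CI for the median: [{median_lo:.3f}, {median_hi:.3f}]")
```

In this toy sample the two outliers dominate the RMSE, while the median and NMAD stay close to the scale of the remaining errors, which is the behavior that motivates the robust measures.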

This study is devoted to further developing robust statistical methods for DEM accuracy assessment. Ideally, a DEM accuracy assessment result should not be noticeably affected by outliers in the raw data, whose source error is assumed to be approximately normal. This research not only presents robust statistics for quantitatively assessing DEM accuracy, but also provides their corresponding confidence intervals to indicate the overall variation of the DEM accuracy. In this study, the interpolated residual population is first generated for each sampling site. Three average error statistics are then computed from these residuals to measure the DEM accuracy, and a confidence interval is constructed for each statistic to characterize the error variation of the DEM quality. The first approach is the t-distribution-based method proposed by Aguilar et al. (2007), while the other two approaches are based mainly on robust theory and aim to counteract the effects of outliers, and even of a skewed distribution, in the interpolated DEM residuals. The performance of the three methods has been verified using ASTER GDEM (Advanced Spaceborne Thermal Emission and Reflection Radiometer Global Digital Elevation Model) data from Shaanxi Province, China, chosen for the varied terrain morphology it covers. Regarding the ASTER GDEM data, Slater et al. (2011) carried out an empirical accuracy assessment on a globally distributed sample dataset and conducted statistical analyses comparing the GDEM with known reference DEMs and ground control points (GCPs). Their results revealed a systematic bias in the ASTER GDEM elevations, higher average noise levels, and a lower effective ground resolution, as well as numerous topographic artifacts and anomalies. Chrysoulakis et al. (2011), investigating the whole area of Greece, pointed out that the ASTER GDEM product overall failed to meet its pre-production estimated vertical accuracy. Nevertheless, ASTER GDEM data remain widely used in terrain surface reconstruction, geomorphological modeling, and landscape visualization. Using these data, the experimental studies based on Monte Carlo simulations presented below indicate that the robust methods produce more reliable assessment results than the classical t-distribution-based method.

The rest of this paper is organized as follows. Section ‘Problem definition’ describes the research problem. Section ‘Three “average error” statistics and their corresponding confidence intervals’ proposes three statistical methods, together with their corresponding confidence intervals, for DEM quality assessment. Their performance is investigated in section ‘Numerical experiments’ through numerical experiments on actual DEM data sets. Section ‘Results and analysis’ presents the discussion and findings from the experiments. Conclusions are provided in the final section, ‘Conclusions’.

Problem definition

This study aims to develop robust statistical methods, applied to the population of interpolated residuals, for DEM accuracy assessment. The procedure first samples a subset of seed points from the original DEM data. An interpolated DEM surface is then generated from these points using the bi-linear interpolation method. Checkpoints are chosen from the interpolated DEM, and their elevations are compared with the corresponding “true values” from the original DEM data, thereby
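The residual-generation procedure described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: seed points are taken as every `step`-th cell of a regular grid, every non-seed cell is used as a checkpoint, and the DEM is synthetic. The function names, the sampling scheme, and the test surface are hypothetical and are not the authors' implementation.

```python
import numpy as np

def bilinear_from_seeds(dem, step):
    """Rebuild a surface by bi-linear interpolation of every `step`-th cell.

    Assumes at least two seed rows and two seed columns.
    """
    dem = np.asarray(dem, dtype=float)
    seed_r = np.arange(0, dem.shape[0], step)
    seed_c = np.arange(0, dem.shape[1], step)
    seeds = dem[np.ix_(seed_r, seed_c)]
    # Only cells inside the hull of the seed grid can be interpolated.
    r = np.arange(seed_r[-1] + 1)
    c = np.arange(seed_c[-1] + 1)
    # Upper-left seed index of the enclosing seed cell for each target cell.
    i0 = np.minimum(r // step, seed_r.size - 2)
    j0 = np.minimum(c // step, seed_c.size - 2)
    # Fractional position of each target inside its seed cell, in [0, 1].
    tr = ((r - i0 * step) / step)[:, None]
    tc = ((c - j0 * step) / step)[None, :]
    i0, j0 = i0[:, None], j0[None, :]
    z00, z01 = seeds[i0, j0], seeds[i0, j0 + 1]
    z10, z11 = seeds[i0 + 1, j0], seeds[i0 + 1, j0 + 1]
    return ((1 - tr) * (1 - tc) * z00 + (1 - tr) * tc * z01
            + tr * (1 - tc) * z10 + tr * tc * z11)

def interpolation_residuals(dem, step):
    """Interpolated minus original elevation at every non-seed checkpoint."""
    dem = np.asarray(dem, dtype=float)
    interp = bilinear_from_seeds(dem, step)
    truth = dem[:interp.shape[0], :interp.shape[1]]
    is_seed = np.zeros(interp.shape, dtype=bool)
    is_seed[::step, ::step] = True
    return (interp - truth)[~is_seed]

# Synthetic stand-in for an original DEM (elevations in metres).
y, x = np.mgrid[0:41, 0:41]
dem = 100.0 + 5.0 * np.sin(x / 6.0) + 0.02 * y ** 2

residuals = interpolation_residuals(dem, step=4)
squared_residuals = residuals ** 2  # population the statistics are applied to
print(residuals.size, float(np.abs(residuals).max()))
```

A planar surface is reproduced exactly by bi-linear interpolation, so any nonzero residual in this sketch comes from terrain curvature between seed points.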

Three “average error” statistics and their corresponding confidence intervals

In this section, three “average error” statistics for squared interpolated residual population are presented, as well as their corresponding confidence intervals.
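To make the three statistics concrete, here is a hedged sketch of one common way to compute each with a 95% confidence interval. The constructions below are assumptions chosen for illustration and may differ from the ones derived in the paper: the mean interval uses the normal quantile as a large-sample stand-in for the Student's t quantile, the median interval uses a normal approximation to the binomial ranks of the order statistics, and the M-estimator is a Huber location estimate (tuning constant 1.345, MAD scale) with its asymptotic variance. All function names and the simulated data are hypothetical.

```python
import math
from statistics import NormalDist
import numpy as np

def _z(level):
    # Two-sided standard normal quantile for the requested confidence level.
    return NormalDist().inv_cdf(0.5 + level / 2)

def mean_ci(x, level=0.95):
    """Sample mean; the normal quantile approximates Student's t for large n."""
    x = np.asarray(x, dtype=float)
    half = _z(level) * x.std(ddof=1) / math.sqrt(x.size)
    m = float(x.mean())
    return m, m - half, m + half

def median_ci(x, level=0.95):
    """Sample median with a distribution-free order-statistic interval."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    spread = _z(level) * math.sqrt(n) / 2
    # Approximate binomial ranks bracketing the median, as 0-based indices.
    lo = max(int(math.floor(n / 2 - spread)) - 1, 0)
    hi = min(int(math.ceil(n / 2 + spread)), n - 1)
    return float(np.median(x)), float(x[lo]), float(x[hi])

def huber_ci(x, level=0.95, k=1.345, tol=1e-8, max_iter=100):
    """Huber M-estimate of location with an asymptotic-variance interval."""
    x = np.asarray(x, dtype=float)
    mu = float(np.median(x))
    scale = 1.4826 * float(np.median(np.abs(x - mu)))
    if scale == 0.0:
        scale = 1.0  # degenerate sample; fall back to unit scale
    for _ in range(max_iter):
        r = (x - mu) / scale
        # Huber weights: 1 inside [-k, k], k/|r| outside.
        w = np.minimum(1.0, k / np.maximum(np.abs(r), 1e-12))
        new_mu = float(np.sum(w * x) / np.sum(w))
        converged = abs(new_mu - mu) <= tol * scale
        mu = new_mu
        if converged:
            break
    r = (x - mu) / scale
    psi = np.clip(r, -k, k)
    dpsi = np.mean(np.abs(r) <= k)  # average derivative of psi
    var = scale ** 2 * np.mean(psi ** 2) / dpsi ** 2 / x.size
    half = _z(level) * math.sqrt(var)
    return mu, mu - half, mu + half

# Simulated residuals with ten gross outliers, then squared.
rng = np.random.default_rng(42)
res = rng.normal(0.0, 1.0, 500)
res[:10] += 30.0
sq = res ** 2

mean_est, mean_lo, mean_hi = mean_ci(sq)
med_est, med_lo, med_hi = median_ci(sq)
hub_est, hub_lo, hub_hi = huber_ci(sq)
for name, est, lo, hi in [("mean", mean_est, mean_lo, mean_hi),
                          ("median", med_est, med_lo, med_hi),
                          ("Huber M", hub_est, hub_lo, hub_hi)]:
    print(f"{name:8s} {est:8.3f}  [{lo:8.3f}, {hi:8.3f}]")
```

For small samples the normal quantile in `mean_ci` understates the interval width; a t quantile (for example from `scipy.stats.t.ppf`) would be the closer match to a t-distribution-based method.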

Numerical experiments

The study area was chosen from Shaanxi Province (left part of Fig. 2), located in the hinterland of China, because of its diverse morphological characteristics, which make it well suited to examining the performance of the robust statistical methods described here.

In this study, five sample regions were selected from this area with each sample region representing one typical topographic feature. Sample region A: between 34.06°–34.34° N and 108.56°–108.84° E is situated in the

Results and analysis

Table 2 provides the statistical characteristics of the residual population for each of the five sample regions. The absolute mean values are closely related to the terrain complexity in each region: they are smaller for flat topography and larger for rugged topography. Systematic errors clearly dominate, and the standard deviations also reflect the roughness of the sampling regions. Meanwhile, the skewness statistic is also

Conclusions

Three average error statistics, the mean, the median, and an M-estimator, computed on the squared interpolated DEM residuals, have been investigated with regard to their robustness in DEM accuracy assessment. Confidence intervals at a 95% confidence level are also constructed for each of these statistics. Experimental results indicate that the median and M-estimator perform well in counteracting the outlier effects in the interpolated DEM residuals compared with the

Acknowledgments

Mr. Bin WANG is the beneficiary of a doctoral grant from the AXA Research Fund. Dr. Wenzhong SHI acknowledges the funding support from the Ministry of Science and Technology of China (Project No.: 2012AA12A305, 2012BAJ15B04). The authors would also like to thank Teng ZHONG from the Hong Kong University for providing the ASTER GDEM data, Mrs. Elaine Anson of The Hong Kong Polytechnic University for her careful language polishing work, and the handling editor's efforts, as well as
