Low-quality multivariate spatio-temporal serial data preprocessing

Cluster Computing

Abstract

As spatio-temporal data accumulate, the quality problems of multivariate spatio-temporal data become apparent: numerous missing values, high noise in time series, and widely differing spatial scales. To address these problems in multivariate spatio-temporal series, we propose three methods. First, an improved non-local means (NLM) algorithm and a nonlinear regression analysis method are used to impute incomplete data. Second, the NLM algorithm is used to remove Gaussian white noise from time series. Third, the Gaussian pyramid method is used to rescale multivariate spatial data. We compare the proposed methods with traditional ones, namely the K-nearest neighbor algorithm and the wavelet threshold method, using quantitative evaluation indices. The simulation experiments indicate that the interpolation accuracy of the two proposed imputation algorithms is higher than that of the K-nearest neighbor algorithm, and that denoising with the NLM algorithm clearly outperforms the wavelet threshold method. The experimental results further indicate that the Gaussian pyramid method effectively transforms the spatial scale of multivariate spatial data while preserving the local detail of the spatial data, with better visual expression.
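
As a rough illustration of two techniques named in the abstract, the sketch below shows a plain one-dimensional non-local means filter for a noisy time series and a single Gaussian pyramid reduction step for a gridded variable. It is not the authors' improved NLM algorithm. The function names, the parameters (`patch_radius`, `search_radius`, `h`) and the 5-tap binomial kernel are assumptions chosen for the example.

```python
import numpy as np


def nlm_denoise_1d(x, patch_radius=3, search_radius=20, h=0.5):
    """Plain non-local means for a 1-D series (illustrative parameters)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Reflect-pad so that every sample has a full patch around it.
    padded = np.pad(x, patch_radius, mode="reflect")
    # Row i holds the patch centred on sample i.
    patches = np.lib.stride_tricks.sliding_window_view(
        padded, 2 * patch_radius + 1
    )
    out = np.empty(n)
    for i in range(n):
        lo, hi = max(0, i - search_radius), min(n, i + search_radius + 1)
        # Mean squared difference between patch i and each candidate patch.
        d2 = np.mean((patches[lo:hi] - patches[i]) ** 2, axis=1)
        # Similar patches get weights near 1, dissimilar ones near 0.
        w = np.exp(-d2 / (h * h))
        out[i] = np.sum(w * x[lo:hi]) / np.sum(w)
    return out


def pyramid_reduce(grid):
    """One Gaussian pyramid level: binomial blur, then keep every second cell."""
    kernel = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0

    def blur(v):
        # Reflect-pad by 2 so the 'valid' convolution keeps the length.
        return np.convolve(np.pad(v, 2, mode="reflect"), kernel, mode="valid")

    g = np.asarray(grid, dtype=float)
    for axis in (0, 1):
        g = np.apply_along_axis(blur, axis, g)
    return g[::2, ::2]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 4.0 * np.pi, 400)
    noisy = np.sin(t) + 0.3 * rng.normal(size=t.size)
    denoised = nlm_denoise_1d(noisy)
    coarse = pyramid_reduce(rng.normal(size=(64, 64)))
    print(denoised.shape, coarse.shape)
```

In this sketch, `h` controls how quickly the weight decays with patch dissimilarity, so it would normally be tuned to the noise level of the series. Each call to `pyramid_reduce` roughly halves the grid resolution along both axes.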

Author information

Corresponding authors

Correspondence to Lajiao Chen or Weijing Song.

About this article

Cite this article

Yu, T., Li, L., Chen, L. et al. Low-quality multivariate spatio-temporal serial data preprocessing. Cluster Comput 22 (Suppl 1), 2357–2370 (2019). https://doi.org/10.1007/s10586-017-1453-8

  • DOI: https://doi.org/10.1007/s10586-017-1453-8
