An Improved Multi-source Spatiotemporal Data Fusion Model Based on the Nearest Neighbor Grids for PM2.5 Concentration Interpolation and Prediction

  • Conference paper
Data Mining and Big Data (DMBD 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1744)

Abstract

PM2.5 concentrations are mainly acquired from two kinds of air quality monitoring stations: small stations and provincial control stations (PCSs). PCSs are sparsely distributed because of their high cost, whereas small stations are relatively dense and spread over the whole area because of their relatively low cost; the observations of small stations can therefore be used to predict those of PCSs. Based on this consideration, we propose a novel multi-source spatiotemporal data fusion method via the nearest neighbor grids, named MSF-NNG, to interpolate and predict the PM2.5 concentrations of PCSs using the data of small stations. First, we divide the city into 1 km \(\times\) 1 km grids and use the Cressman interpolation method to fill the grids that lack observations from the observations of small stations, where the observations include PM2.5 concentration, humidity, temperature and wind speed. Second, the neighbor grids of each PCS are found based on the grid partition. Third, MSF-NNG interpolates and predicts the PM2.5 concentrations of a PCS by fusing the PM2.5 concentration, humidity, temperature and wind speed information of the corresponding neighbor grids. Finally, comparison experiments on several data sets show that MSF-NNG has clear advantages over fourteen algorithms in interpolation and over twelve algorithms in prediction of PM2.5 concentrations.
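The first two steps of the abstract (grid filling and neighbor lookup) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it uses the classic single-pass Cressman weight \((R^2 - d^2)/(R^2 + d^2)\), and the function names, the influence radius, the square-ring definition of "neighbor grids", and the sample station values are all assumptions made for the example.

```python
import math


def cressman_weight(dist_km, radius_km):
    """Classic Cressman weight: (R^2 - d^2) / (R^2 + d^2) inside radius R, else 0."""
    if dist_km >= radius_km:
        return 0.0
    r2, d2 = radius_km ** 2, dist_km ** 2
    return (r2 - d2) / (r2 + d2)


def cressman_fill(cell_center, stations, radius_km):
    """Estimate a value at cell_center from (x_km, y_km, value) station tuples.

    Single-pass weighted average (an assumed simplification of the
    successive-correction scheme). Returns None when no station lies
    within radius_km of the cell center.
    """
    cx, cy = cell_center
    num = 0.0
    den = 0.0
    for sx, sy, value in stations:
        w = cressman_weight(math.hypot(sx - cx, sy - cy), radius_km)
        num += w * value
        den += w
    return num / den if den > 0 else None


def neighbor_cells(row, col, n_rows, n_cols, ring=1):
    """Grid cells within `ring` steps of (row, col), excluding the cell itself.

    The square-ring neighborhood is an assumption; the abstract does not
    say how neighbor grids are selected.
    """
    return [
        (r, c)
        for r in range(max(0, row - ring), min(n_rows, row + ring + 1))
        for c in range(max(0, col - ring), min(n_cols, col + ring + 1))
        if (r, c) != (row, col)
    ]


if __name__ == "__main__":
    # Hypothetical small-station PM2.5 readings at (x_km, y_km).
    small_stations = [(0.5, 0.5, 40.0), (2.5, 0.5, 60.0)]
    # Center of an empty 1 km x 1 km cell, equidistant from both stations.
    print(cressman_fill((1.5, 0.5), small_stations, radius_km=3.0))
    # Neighbors of the corner cell of a 3 x 3 grid.
    print(neighbor_cells(0, 0, 3, 3))
```

In this sketch a PCS would be mapped to the grid cell that contains it, and the filled values of the cells returned by `neighbor_cells` would form the inputs to the fusion model; how MSF-NNG actually combines them is not described in the abstract and is not reproduced here.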

Acknowledgments

The authors gratefully acknowledge the support of the National Key R&D Program of China (2019YFB2103000), the State Key Program of the National Natural Science Foundation of China (61936001), the Natural Science Foundation of Chongqing (cstc2019jcyj-cxttX0002, cstc2020jcyj-msxmX0737, cstc2021ycjh-bgzxm0013), the Key Cooperation Project of Chongqing Municipal Education Commission (HZ2021008), and the Science and Technology Research Program of Chongqing Education Commission of China (KJQN201900638).

Author information

Correspondence to Xiaxia Zhang or Guoyin Wang.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Zhang, X., Hu, J., Zhou, P., Wang, G. (2022). An Improved Multi-source Spatiotemporal Data Fusion Model Based on the Nearest Neighbor Grids for PM2.5 Concentration Interpolation and Prediction. In: Tan, Y., Shi, Y. (eds) Data Mining and Big Data. DMBD 2022. Communications in Computer and Information Science, vol 1744. Springer, Singapore. https://doi.org/10.1007/978-981-19-9297-1_20

  • DOI: https://doi.org/10.1007/978-981-19-9297-1_20

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-9296-4

  • Online ISBN: 978-981-19-9297-1

  • eBook Packages: Computer Science, Computer Science (R0)
