Long-term time-series pollution forecast using statistical and deep learning methods

  • Original Article
Neural Computing and Applications

Abstract

Tackling air pollution has become critically important over the last few decades. Various statistical and deep learning methods have been proposed, but they have seldom been used to forecast long-term pollution trends. Such long-term forecasts are highly important for government bodies around the globe, as they help in framing efficient environmental policies. This paper presents a comparative study of statistical and deep learning methods for forecasting long-term pollution trends for the two most important categories of particulate matter (PM), PM2.5 and PM10. The study is based on Kolkata, a major city in eastern India. Historical pollution data collected from monitoring stations set up by the government in Kolkata are analysed with various time-series analysis techniques to identify the underlying patterns, which are then used to produce a forecast for the next two years using different statistical and deep learning methods. The findings indicate that, on the limited data available, statistical methods such as auto-regressive (AR), seasonal auto-regressive integrated moving average (SARIMA) and Holt–Winters outperform deep learning methods such as stacked, bi-directional, auto-encoder and convolutional long short-term memory networks.
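
As a rough illustration of the statistical side of this comparison, the following is a minimal sketch of additive Holt–Winters smoothing in plain Python, fitted on a synthetic monthly series and scored on a 24-month holdout against a seasonal-naive baseline. The synthetic data, the function names, the smoothing parameters (0.3, 0.05, 0.2), the monthly sampling and the 12-month season are all assumptions made for illustration. None of them come from the paper, and this is not the authors' implementation.

```python
import math
import random


def holt_winters_additive(series, season_len, alpha, beta, gamma, horizon):
    """Additive Holt-Winters; returns `horizon` forecasts past the end of `series`."""
    if len(series) < 2 * season_len:
        raise ValueError("need at least two full seasons of data")
    first = series[:season_len]
    second = series[season_len:2 * season_len]
    # Initial trend: average per-step change between the first two seasons.
    trend = (sum(second) - sum(first)) / season_len ** 2
    mean_first = sum(first) / season_len
    mid = (season_len - 1) / 2
    # Initial seasonal terms: first season with its mean and linear trend removed.
    seasonal = [x - (mean_first + trend * (i - mid)) for i, x in enumerate(first)]
    # Initial level, taken at the last point of the first season.
    level = mean_first + trend * mid
    for t in range(season_len, len(series)):
        y = series[t]
        s = seasonal[t % season_len]
        prev_level = level
        level = alpha * (y - s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % season_len] = gamma * (y - level) + (1 - gamma) * s
    n = len(series)
    return [level + h * trend + seasonal[(n + h - 1) % season_len]
            for h in range(1, horizon + 1)]


def seasonal_naive(train, season_len, horizon):
    """Baseline: repeat the last observed season."""
    start = len(train) - season_len
    return [train[start + (h % season_len)] for h in range(horizon)]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Synthetic "monthly concentration" series (assumed, not real data):
# linear trend + yearly cycle + Gaussian noise, six years long.
rng = random.Random(0)
series = [100 + 0.2 * t + 40 * math.cos(2 * math.pi * t / 12) + rng.gauss(0, 5)
          for t in range(72)]
train, test = series[:48], series[48:]  # hold out the final two years

hw_forecast = holt_winters_additive(train, 12, 0.3, 0.05, 0.2, horizon=len(test))
naive_forecast = seasonal_naive(train, 12, len(test))
print(f"Holt-Winters RMSE:   {rmse(test, hw_forecast):.2f}")
print(f"Seasonal-naive RMSE: {rmse(test, naive_forecast):.2f}")
```

Holding out the final two years mirrors the two-year forecast horizon mentioned in the abstract. In practice the smoothing parameters would be fitted rather than fixed by hand, and a library implementation would normally be preferred over this hand-rolled version.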

Acknowledgements

This comparative study was supported by the project entitled “Participatory and Realtime Pollution Monitoring System For Smart City”, funded by the Department of Science and Technology, Government of West Bengal, India.

Author information

Corresponding author

Correspondence to Sarbani Roy.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Nath, P., Saha, P., Middya, A.I. et al. Long-term time-series pollution forecast using statistical and deep learning methods. Neural Comput & Applic 33, 12551–12570 (2021). https://doi.org/10.1007/s00521-021-05901-2

  • DOI: https://doi.org/10.1007/s00521-021-05901-2