High granular and short term time series forecasting of \(\hbox {PM}_{2.5}\) air pollutant - a comparative review

Artificial Intelligence Review

Abstract

Time series forecasting has acquired immense research importance and has vast applications in the area of air pollution monitoring. This work investigates the abilities of various existing techniques when applied to short-term, high-granularity time series forecasting of \(\hbox {PM}_{2.5}\). More specifically, a comparative study is provided that takes into account both popular and lesser-used models in this area. The study considers ten well-defined models: ARIMA (auto-regressive integrated moving average), SARIMA (seasonal ARIMA), SES (single exponential smoothing), DES (double exponential smoothing), TES (triple exponential smoothing), ANN (artificial neural network), DT (decision tree), kNN (k-nearest neighbor), LSTM (long short-term memory) and MCFO (Markov chain first order). A framework has been built that categorizes the models, implements them under an identical execution environment and forecasts succeeding values. The implementation has been carried out over five data sets of real-world air pollution time series, collected from five differently located monitoring stations set up by the government, over a period of 1 year (July 2018 to June 2019). Rigorous statistical analysis has been performed that yields insight into the nature and variability of these time series data. Forecasting has been carried out on a short-term basis with a focus on high granularity, and three different lengths of forecast horizon (1 day, 1 week, and 1 month) have been tested. Finally, the models have been compared in terms of three performance measures, namely RMSE (root mean squared error), MAE (mean absolute error) and MAPE (mean absolute percentage error). The comparative results, verified with multiple datasets, show that all the models exhibit less error for a shorter forecast horizon, with LSTM providing the best performance. Machine learning and deep learning models are found to be superior for longer forecast horizons, with kNN achieving the best accuracy, whereas significant performance degradation of ARIMA is found for longer forecast horizons. Moreover, TES, DT, kNN, LSTM and MCFO are found to adapt well to the shape and variability of the data. The study of performance over various lengths of high-granularity forecast horizon, across multiple datasets, gives added value to this work.
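
As a rough illustration of the three error measures named above, the following Python sketch computes RMSE, MAE and MAPE for a forecast and pairs them with a bare-bones single exponential smoothing (SES) forecaster. It is not the authors' implementation: the function names, the smoothing constant `alpha`, the choice to initialise the level with the first observation, and the sample readings are all assumptions made for this example.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero, so such inputs are rejected.
    """
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def ses_forecast(history, alpha, horizon):
    """Single exponential smoothing with a flat forecast.

    Assumption: the level is initialised with the first observation.
    SES carries no trend or seasonality, so every step of the horizon
    receives the same value (the final smoothed level).
    """
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return [level] * horizon


if __name__ == "__main__":
    # Hypothetical PM2.5 readings (micrograms per cubic metre), not from the article.
    train = [42.0, 45.0, 50.0, 48.0, 47.0]
    test = [46.0, 49.0, 52.0]
    forecast = ses_forecast(train, alpha=0.5, horizon=len(test))
    print("RMSE:", rmse(test, forecast))
    print("MAE :", mae(test, forecast))
    print("MAPE:", mape(test, forecast))
```

Because SES produces a flat forecast, its error would be expected to grow as the horizon lengthens on data with trend or seasonality, which is one reason to compare models separately at each horizon length.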

Acknowledgements

This research work is supported by the project entitled "Participatory and Realtime Pollution Monitoring System For Smart City", funded by Higher Education, Science & Technology and Biotechnology, Department of Science & Technology, Government of West Bengal, India.

Author information

Corresponding author

Correspondence to Sarbani Roy.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Das, R., Middya, A.I. & Roy, S. High granular and short term time series forecasting of \(\hbox {PM}_{2.5}\) air pollutant - a comparative review. Artif Intell Rev 55, 1253–1287 (2022). https://doi.org/10.1007/s10462-021-09991-1

  • DOI: https://doi.org/10.1007/s10462-021-09991-1
