Abstract
Accurate prediction of PM2.5 concentrations can provide a solid foundation for preventing and controlling air pollution. When a Long Short-Term Memory (LSTM) network is applied to predict PM2.5 concentration, the influential factors strongly correlated with PM2.5 concentration are fed directly into the network. However, these factors do not all influence PM2.5 concentration to the same degree. To address this issue, this study proposes a weighted Random Forest (RF)-LSTM model to predict PM2.5 concentration for the next six hours. The model first uses the RF to select the factors that are most important for predicting PM2.5 concentration and then uses a fully connected neural network to learn a weight value for each selected factor. Finally, the weighted data are fed into the LSTM network. The model is trained, validated, and tested using hourly air pollutant and meteorological data collected from four monitoring stations in Beijing, China, from November 1, 2019 to February 28, 2022. The prediction performance of the weighted RF-LSTM model is compared with that of the RF-LSTM and LSTM models. The results show that, for predictions of PM2.5 concentration over the next six hours at all four stations, the weighted RF-LSTM model has the smallest RMSE and MAE and the largest R2. Compared to the LSTM model, the weighted RF-LSTM model decreases RMSE by 2.3%-5.3% and MAE by 5.6%-9.6%, and improves R2 by 2.0%-4.8%, showing that the proposed model achieves better prediction performance and has strong generalization ability.
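The data flow described in the abstract (RF-based factor selection, a fully connected layer that produces factor weights, and weighted inputs prepared for the LSTM) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the synthetic data, the helper names `select_factors`, `factor_weights`, and `make_windows`, the softmax normalization of the weights, the 24-hour lookback, and the untrained placeholder parameters `W` and `b`. The LSTM itself is omitted; only the shape of its input and target tensors is shown.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def select_factors(X, y, k, seed=0):
    """Rank candidate factors by RF impurity importance and keep the top k."""
    rf = RandomForestRegressor(n_estimators=50, random_state=seed)
    rf.fit(X, y)
    order = np.argsort(rf.feature_importances_)[::-1]
    return np.sort(order[:k])


def factor_weights(X, W, b):
    """One fully connected layer plus softmax: one weight per factor per row.
    The softmax is an assumption; the abstract does not name the activation."""
    logits = X @ W + b
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)


def make_windows(X, y, lookback, horizon):
    """Build (samples, lookback, factors) inputs and (samples, horizon)
    targets that cover the next `horizon` hours after each window."""
    n = len(X) - lookback - horizon + 1
    inputs = np.stack([X[i:i + lookback] for i in range(n)])
    targets = np.stack(
        [y[i + lookback:i + lookback + horizon] for i in range(n)]
    )
    return inputs, targets


# Synthetic hourly data (hypothetical): only factors 0 and 3 drive the target.
rng = np.random.default_rng(0)
hours, n_factors = 500, 8
X = rng.normal(size=(hours, n_factors))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.normal(size=hours)

# Step 1: RF keeps the most important factors.
selected = select_factors(X, y, k=4)
X_sel = X[:, selected]

# Step 2: a fully connected layer assigns a weight to each selected factor.
# W and b are untrained placeholders; in a real model they would be learned
# together with the LSTM by backpropagation.
k = X_sel.shape[1]
W = rng.normal(scale=0.1, size=(k, k))
b = np.zeros(k)
weights = factor_weights(X_sel, W, b)
X_weighted = X_sel * weights

# Step 3: the weighted data are windowed for a six-hour-ahead LSTM.
inputs, targets = make_windows(X_weighted, y, lookback=24, horizon=6)
print(selected, inputs.shape, targets.shape)
```

In this sketch the weights rescale each factor before the sequence model sees it, which is one plausible reading of "weighted data"; the paper's actual layer sizes, activation, and training procedure may differ.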
Data Availability
The data that support the findings of this study are available on request from the corresponding author.
Abbreviations
- RF: Random Forest
- LSTM: Long Short-Term Memory
- Weighted RF-LSTM: weighted Random Forest-Long Short-Term Memory
- RF-LSTM: Random Forest-Long Short-Term Memory
- PM2.5: PM2.5 concentration
- PM10: PM10 concentration
- NO2: NO2 concentration
- CO: CO concentration
- SO2: SO2 concentration
- O3: O3 concentration
- RNN: Recurrent Neural Network
- BP: Back Propagation
- FC: Fully Connected
- RMSE: root mean square error
- MAE: mean absolute error
- R2: coefficient of determination
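For reference, the three evaluation metrics abbreviated above are conventionally defined as follows. These are the standard textbook forms, given here as an assumption about what the paper uses, since its own definitions do not appear in this excerpt. The symbols are introduced for illustration: y_i is the observed PM2.5 concentration, \hat{y}_i the predicted value, \bar{y} the mean of the observations, and n the number of test samples.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2},
\qquad
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|,
\qquad
R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}
```

Lower RMSE and MAE and a higher R2 indicate a better fit, which is the sense in which the abstract reports the smallest RMSE and MAE and the largest R2 for the weighted RF-LSTM model.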
Funding
This work was supported by the National Natural Science Foundation (project "Research on air pollution and ecological toughness and time and space in the Yellow River Basin City", Grant 12361108), the Ningxia Natural Science Foundation (Grant no. 2023AAC03278), the First Class Disciplines Foundation of Ningxia (Grant NXYLXK2017B09), and the 2023 Ningxia Youth Top Talent Program.
Author information
Contributions
Weifu Ding: funding acquisition, investigation, visualization. Huihui Sun: methodology, resources, supervision, data availability. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Communicated by: H. Babaie
About this article
Cite this article
Ding, W., Sun, H. Prediction of PM2.5 concentration based on the weighted RF-LSTM model. Earth Sci Inform 16, 3023–3037 (2023). https://doi.org/10.1007/s12145-023-01111-7
DOI: https://doi.org/10.1007/s12145-023-01111-7