Abstract
Particulate matter concentration is one of several variables monitored at regular intervals to calculate air quality indices (AQI), which are intended to help understand the acute and chronic effects of air quality on human health. The fine particulate (PM2.5) samplers installed at pollution monitoring stations continuously monitor the concentration of the pollutant in air over time. The specific time-averaged concentration is then estimated from the continuous records. Missing records in the PM2.5 time series are common and are attributed to faulty equipment, routine maintenance schedules, or replacement of equipment. When one or more point observations in a time series are missing, it is essential to estimate or predict the missing values. This study presents the application of machine learning techniques such as support vector regression (SVR), the group method of data handling (GMDH) network, and an evolutionary adaptive neuro-fuzzy inference system to estimate the 24-h average PM2.5 concentration levels at a particular station using PM2.5 concentration levels observed at neighboring stations as inputs. The performance of these models is evaluated in terms of widely used statistical metrics such as centered root mean square difference (CRMSD), normalized Nash–Sutcliffe efficiency (NNSE), and correlation coefficient (R). The findings of the study reveal that the GMDH model provided reasonably accurate estimates of daily PM2.5 levels.
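The three evaluation metrics named in the abstract can be illustrated with a short sketch. It uses the commonly cited definitions of CRMSD, NNSE (taken here as 1 / (2 − NSE)), and the Pearson correlation coefficient; the abstract does not give the chapter's exact formulas, so these definitions are an assumption. The function names and the sample concentrations are hypothetical and are not data from the study.

```python
import math


def _mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def crmsd(observed, predicted):
    """Centered root mean square difference: RMS difference after
    removing the mean from each series (insensitive to constant bias)."""
    mo, mp = _mean(observed), _mean(predicted)
    sq = [((p - mp) - (o - mo)) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(_mean(sq))


def nnse(observed, predicted):
    """Normalized Nash-Sutcliffe efficiency, assumed here to be
    1 / (2 - NSE), which maps NSE in (-inf, 1] onto (0, 1]."""
    mo = _mean(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mo) ** 2 for o in observed)
    nse = 1.0 - sse / sst
    return 1.0 / (2.0 - nse)


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between the two series."""
    mo, mp = _mean(observed), _mean(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical 24-h average PM2.5 values (micrograms per cubic metre).
obs = [10.0, 20.0, 30.0, 40.0]
est = [15.0, 25.0, 35.0, 45.0]  # same pattern with a constant +5 bias

print(crmsd(obs, est), nnse(obs, est), pearson_r(obs, est))
```

With a constant offset between the two series, CRMSD and R report a perfect match while NNSE is penalized by the bias, which is one reason to report the three metrics together.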
Acknowledgements
The authors would like to thank the Central Pollution Control Board, India, for hosting on its website the data used in this study.
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Krishnappa, L., Devatha, C.P. (2019). Machine Learning Approaches for the Estimation of Particulate Matter (PM2.5) Concentration Levels: A Case Study in the Hyderabad City, India. In: Bansal, J., Das, K., Nagar, A., Deep, K., Ojha, A. (eds) Soft Computing for Problem Solving. Advances in Intelligent Systems and Computing, vol 816. Springer, Singapore. https://doi.org/10.1007/978-981-13-1592-3_61
DOI: https://doi.org/10.1007/978-981-13-1592-3_61
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1591-6
Online ISBN: 978-981-13-1592-3
eBook Packages: Intelligent Technologies and Robotics (R0)