Abstract
This study focuses on the instrumentation of wastewater treatment stations and is preliminary work toward implementing virtual sensors in the future. At this stage, the correlation between monitored variables is analysed for a specific case. The main aim is to select a suitable and sufficient set of variables to predict the chemical oxygen demand (COD) in a wastewater treatment plant. First, four feature selection methods are applied to all the monitored variables. Then, three regression techniques are applied to measure the performance of the variable sets obtained in the previous step. An acceptable COD prediction was obtained in all cases, which will allow virtual sensors to be implemented in the future with predictably adequate accuracy.
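The two-step workflow described in the abstract (select variables, then score the selection with a regressor) can be illustrated with a minimal sketch. The abstract does not name the four selection methods or the three regression techniques, so the choices here are assumptions made only for illustration: a Pearson-correlation filter stands in for feature selection and a k-nearest-neighbour regressor stands in for the regression step. The variable names (x0 to x3), the synthetic data, and the helper functions are all hypothetical and are not taken from the study.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def select_top_k(columns, target, k):
    """Filter-style selection: keep the k variables with the largest |r| against the target."""
    ranked = sorted(columns, key=lambda name: abs(pearson(columns[name], target)), reverse=True)
    return ranked[:k]


def knn_predict(train_X, train_y, query, k=5):
    """Average the targets of the k training rows closest to the query (Euclidean distance)."""
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))[:k]
    return sum(train_y[i] for i in nearest) / k


def holdout_mae(columns, target, names, n_train, k=5):
    """Mean absolute error on the rows after n_train, using only the named variables."""
    rows = list(zip(*(columns[name] for name in names)))
    train_X, test_X = rows[:n_train], rows[n_train:]
    train_y, test_y = target[:n_train], target[n_train:]
    errors = [abs(knn_predict(train_X, train_y, q, k) - y) for q, y in zip(test_X, test_y)]
    return sum(errors) / len(errors)


# Synthetic stand-in for plant data (assumption): COD depends on x0 and x1 only;
# x2 and x3 are unrelated monitored variables.
random.seed(0)
n, n_train = 200, 150
columns = {name: [random.random() for _ in range(n)] for name in ("x0", "x1", "x2", "x3")}
cod = [3 * a + 2 * b + random.gauss(0, 0.05) for a, b in zip(columns["x0"], columns["x1"])]

# Step 1: feature selection. Step 2: score the selection with a regressor.
selected = select_top_k(columns, cod, k=2)
mae_selected = holdout_mae(columns, cod, selected, n_train)
mae_all = holdout_mae(columns, cod, list(columns), n_train)

# Reference point: always predicting the training-set mean COD.
mean_cod = sum(cod[:n_train]) / n_train
mae_baseline = sum(abs(mean_cod - y) for y in cod[n_train:]) / (n - n_train)

print("selected:", selected)
print("MAE selected / all / baseline:", mae_selected, mae_all, mae_baseline)
```

Comparing the error obtained with the selected subset against the error with all variables and against a mean-value baseline is one simple way to judge whether a reduced set of sensors is sufficient for a virtual COD sensor. A real evaluation would use plant measurements and cross-validation rather than a single hold-out split.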
Acknowledgement
Míriam Timiraos’s research was supported by the “Xunta de Galicia” through grants to industrial PhD (http://gain.xunta.gal/), under the “Doutoramento Industrial 2022” grant with reference: 04_IN606D_2022_ 2692965.
Álvaro Michelena’s research was supported by the Spanish Ministry of Universities (https://www.universidades.gob.es/), under the “Formación de Profesorado Universitario” grant with reference: FPU21/00932.
CITIC, as a Research Center of the University System of Galicia, is funded by Consellería de Educación, Universidade e Formación Profesional of the Xunta de Galicia through the European Regional Development Fund (ERDF) and the Secretaría Xeral de Universidades (Ref. ED431G 2019/01).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Timiraos, M. et al. (2023). Comparative Study of Wastewater Treatment Plant Feature Selection for COD Prediction. In: Jove, E., Zayas-Gato, F., Michelena, Á., Calvo-Rolle, J.L. (eds) Distributed Computing and Artificial Intelligence, Special Sessions II - Intelligent Systems Applications, 20th International Conference. DCAI 2023. Lecture Notes in Networks and Systems, vol 742. Springer, Cham. https://doi.org/10.1007/978-3-031-38616-9_2
DOI: https://doi.org/10.1007/978-3-031-38616-9_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-38615-2
Online ISBN: 978-3-031-38616-9
eBook Packages: Intelligent Technologies and Robotics (R0)