Abstract
Water quality estimation using machine learning is a data analysis process in which algorithms identify patterns in large sets of water quality data. Such patterns can reveal pollutants and other potential contamination that could degrade water quality for drinking, recreational activities, or other uses, which helps ensure that the safety of water sources is continuously monitored and maintained. In this paper, a set of existing machine learning classifiers is applied to Internet of Things (IoT) sensor data on various water quality parameters, and the results are compared. Subsequently, a meta ensemble classifier that applies soft voting to the four best-performing of these classifiers is proposed to enhance estimation accuracy. On the majority of the metrics used, this meta ensemble classifier outperforms all of the previously considered classifiers.
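To make the soft voting step concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes each base classifier outputs a probability vector over the classes; the function names, the optional weights, and the example probabilities are hypothetical and chosen only for illustration. Soft voting averages the per-class probabilities across classifiers and predicts the class with the highest average.

```python
from typing import List, Optional, Sequence


def average_probabilities(
    probabilities: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Return the (optionally weighted) mean probability of each class.

    `probabilities` holds one probability vector per base classifier.
    """
    if not probabilities:
        raise ValueError("at least one classifier output is required")
    n_classes = len(probabilities[0])
    if any(len(p) != n_classes for p in probabilities):
        raise ValueError("all classifiers must report the same number of classes")
    if weights is None:
        # Equal weights are an assumption; the abstract does not state any.
        weights = [1.0] * len(probabilities)
    if len(weights) != len(probabilities):
        raise ValueError("one weight per classifier is required")
    total = sum(weights)
    return [
        sum(w * p[c] for w, p in zip(weights, probabilities)) / total
        for c in range(n_classes)
    ]


def soft_vote(
    probabilities: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> int:
    """Return the index of the class with the highest averaged probability."""
    averaged = average_probabilities(probabilities, weights)
    return max(range(len(averaged)), key=averaged.__getitem__)


# Hypothetical outputs of four base classifiers for one water sample,
# as [P(class 0), P(class 1)]. The values are made up for illustration.
example = [
    [0.45, 0.55],
    [0.40, 0.60],
    [0.90, 0.10],
    [0.48, 0.52],
]

predicted_class = soft_vote(example)
```

In this made-up example, three of the four classifiers lean slightly towards class 1, so a hard majority vote would pick class 1. Soft voting picks class 0, because the one confident classifier outweighs the three marginal ones once the probabilities are averaged.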
Acknowledgements
This work was supported by the research project OpenDCO, “Open Data City Officer” (Project No.: 2022-1-CY01-KA220-HED-000089196, Erasmus+ KA2: KA220-HED - Cooperation partnerships in higher education).
Copyright information
© 2023 IFIP International Federation for Information Processing
About this paper
Cite this paper
Davrazos, G., Panagiotakopoulos, T., Kotsiantis, S. (2023). Water Quality Estimation from IoT Sensors Using a Meta-ensemble. In: Maglogiannis, I., Iliadis, L., Papaleonidas, A., Chochliouros, I. (eds) Artificial Intelligence Applications and Innovations. AIAI 2023 IFIP WG 12.5 International Workshops. AIAI 2023. IFIP Advances in Information and Communication Technology, vol 677. Springer, Cham. https://doi.org/10.1007/978-3-031-34171-7_32
DOI: https://doi.org/10.1007/978-3-031-34171-7_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-34170-0
Online ISBN: 978-3-031-34171-7
eBook Packages: Computer Science, Computer Science (R0)