Water Quality Estimation from IoT Sensors Using a Meta-ensemble

  • Conference paper
Artificial Intelligence Applications and Innovations. AIAI 2023 IFIP WG 12.5 International Workshops (AIAI 2023)

Abstract

Water quality estimation with machine learning applies algorithms to large sets of water quality data in order to identify patterns, such as the presence of pollutants and other contaminants that could make water unsuitable for drinking, recreation or other uses. Such analysis supports continuous monitoring of the safety of water sources, including those used for recreational activities. In this paper, a set of existing machine learning classifiers is applied to Internet of Things (IoT) sensor data covering various water quality parameters, and their results are compared. A meta-ensemble classifier that combines the four best-performing of these classifiers through soft voting is then proposed to improve estimation accuracy. On the majority of the metrics used, this meta-ensemble classifier outperforms all of the individual classifiers considered.
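
As a rough illustration of the soft voting idea mentioned in the abstract, the following minimal sketch averages the class probabilities reported by four base classifiers and picks the class with the highest mean. It is not the authors' implementation: the function name `soft_vote`, the two-class labelling and the probability values are all hypothetical and chosen only for the example.

```python
def soft_vote(probabilities_per_model, weights=None):
    """Combine per-model class probabilities by (weighted) averaging.

    probabilities_per_model: one probability vector per base classifier,
    all of the same length (one entry per class).
    Returns (predicted_class_index, averaged_probabilities).
    """
    n_models = len(probabilities_per_model)
    if weights is None:
        weights = [1.0] * n_models  # plain soft voting: equal weights
    n_classes = len(probabilities_per_model[0])
    total_weight = sum(weights)
    averaged = [
        sum(w * probs[c] for w, probs in zip(weights, probabilities_per_model))
        / total_weight
        for c in range(n_classes)
    ]
    # The ensemble prediction is the class with the highest averaged probability.
    label = max(range(n_classes), key=averaged.__getitem__)
    return label, averaged


# Hypothetical outputs of four base classifiers for one sensor reading,
# as [P(class 0), P(class 1)]; the values are made up for illustration.
sample_probs = [
    [0.40, 0.60],
    [0.55, 0.45],
    [0.30, 0.70],
    [0.52, 0.48],
]
label, averaged = soft_vote(sample_probs)
print(label, averaged)
```

In this made-up case two classifiers favour each class, so a simple majority (hard) vote would tie, whereas averaging the probabilities lets the more confident classifiers decide. Libraries such as scikit-learn offer the same scheme through `VotingClassifier` with `voting="soft"`.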

Acknowledgements

This work was supported by the research project OpenDCO, “Open Data City Officer” (Project No.: 2022-1-CY01-KA220-HED-000089196, Erasmus+ KA2: KA220-HED - Cooperation partnerships in higher education).

Author information

Corresponding author

Correspondence to Gregory Davrazos.

Copyright information

© 2023 IFIP International Federation for Information Processing

About this paper

Cite this paper

Davrazos, G., Panagiotakopoulos, T., Kotsiantis, S. (2023). Water Quality Estimation from IoT Sensors Using a Meta-ensemble. In: Maglogiannis, I., Iliadis, L., Papaleonidas, A., Chochliouros, I. (eds) Artificial Intelligence Applications and Innovations. AIAI 2023 IFIP WG 12.5 International Workshops. AIAI 2023. IFIP Advances in Information and Communication Technology, vol 677. Springer, Cham. https://doi.org/10.1007/978-3-031-34171-7_32

  • DOI: https://doi.org/10.1007/978-3-031-34171-7_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-34170-0

  • Online ISBN: 978-3-031-34171-7

  • eBook Packages: Computer Science, Computer Science (R0)
