Approaches to Building a Detection Model for Water Quality: A Case Study

  • Chapter
  • Book: Modern Approaches for Intelligent Information and Database Systems

Part of the book series: Studies in Computational Intelligence ((SCI,volume 769))

Abstract

Predicting the failure or success of an event or value is a problem that has recently been addressed using data mining techniques. By combining information from the past with information from the present, we can increase the chance of making the best decision about a future event. In this paper, we evaluate several popular classification algorithms for modeling a water quality detection system. The experiment is carried out using data gathered from the water company Thüringer Fernwasserversorgung. We briefly introduce the baseline steps we followed to achieve a decent model for this binary classification problem. We describe the algorithms we used and the purpose of each, and in the end we arrive at a final best model. Representative models are compared using the F1 score as the performance measure. Finding the best model allows for early recognition of undesirable changes in drinking water quality and enables water supply companies to take countermeasures in time.
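
As an illustration of the performance measure named in the abstract, the following minimal sketch computes the F1 score, the harmonic mean of precision and recall, for a binary classifier. It is not the authors' code: the function name `f1_score`, the label encoding (1 marks an undesirable water quality event) and the sample labels are assumptions made purely for the example.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 score for binary labels; `positive` marks the event class (assumed to be 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        # No correctly detected events: define F1 as 0 to avoid division by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = event (undesirable change), 0 = normal reading.
y_true = [0, 0, 1, 1, 0, 1, 0, 0]
y_pred = [0, 1, 1, 0, 0, 1, 0, 0]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.3f}")
```

Because F1 ignores true negatives, it is often preferred over plain accuracy when events are rare, which is a common situation in monitoring data.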

Notes

  1. http://gecco-2017.sigevo.org/index.html/HomePage.

  2. http://www.spotseven.de/gecco/gecco-challenge/gecco-challenge-2017/.

Author information

Corresponding authors

Correspondence to Fitore Muharemi or Doina Logofătu.

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Muharemi, F., Logofătu, D., Andersson, C., Leon, F. (2018). Approaches to Building a Detection Model for Water Quality: A Case Study. In: Sieminski, A., Kozierkiewicz, A., Nunez, M., Ha, Q. (eds) Modern Approaches for Intelligent Information and Database Systems. Studies in Computational Intelligence, vol 769. Springer, Cham. https://doi.org/10.1007/978-3-319-76081-0_15

  • DOI: https://doi.org/10.1007/978-3-319-76081-0_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-76080-3

  • Online ISBN: 978-3-319-76081-0

  • eBook Packages: Engineering, Engineering (R0)
