An Approach for Predicting River Water Quality Using Data Mining Technique

  • Conference paper
Advances in Data Mining: Applications and Theoretical Aspects (ICDM 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9165)

Abstract

Water contains many chemical, physical, and biological impurities. Some are benign, while others are toxic. Water quality is defined in terms of its physical, chemical, and biological parameters, and ascertaining that quality is crucial before the water is used for purposes such as drinking, agriculture, or industry. Various analysis methods are employed to determine water quality parameters such as dissolved oxygen (DO), chemical oxygen demand (COD), biochemical oxygen demand (BOD), pH, total dissolved solids (TDS), salinity, chlorophyll-a, coliform, and organic contaminants such as pesticides. The list of potential water contaminants is too long to test for in its entirety, and such testing is sometimes costly and time consuming. This paper applies a data mining technique to build a model that predicts a widely used gross water quality parameter, BOD. BOD is a measure of the amount of dissolved oxygen consumed by microbial oxidation of organic matter in wastewater. The standard method for measuring BOD is a 5-day process. Correct results require dilution of the sample, constant pH and nutrient content, a temperature of 20 °C, and a dark environment. High levels of nitrogen compounds yield false BOD results. Winkler titration, which is also used to measure BOD, is a chemical-intensive process. Hence, an automatic prediction model for BOD has been sought for accurate, cost-effective, and time-saving measurement. Based on available BOD measurement data, this paper describes the development of a BOD prediction model using a data mining technique, namely support vector machines (SVM). The model achieved a correlation coefficient of 0.9471 and an RMSE of 0.5019 on river water quality data. Its performance was also compared with that of two other models, namely an artificial neural network (ANN) and regression by discretization. Simulation results show that the proposed model performs better than the other two in terms of both correlation coefficient and RMSE.
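
The abstract compares models by correlation coefficient and RMSE. As a minimal sketch of how these two scores can be computed from observed and predicted BOD values, the following assumes that the correlation coefficient is the Pearson correlation between observations and predictions; the abstract does not state this. The function names and the sample values are hypothetical and purely illustrative, not data or code from the paper.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def correlation(observed, predicted):
    """Pearson correlation coefficient (assumed definition, see lead-in)."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_o * var_p)


# Hypothetical toy values for illustration only (not from the paper).
observed_bod = [2.0, 3.0, 5.0, 4.0]
predicted_bod = [2.5, 2.5, 5.5, 3.5]

print(f"RMSE: {rmse(observed_bod, predicted_bod):.4f}")
print(f"Correlation: {correlation(observed_bod, predicted_bod):.4f}")
```

Under these definitions, a model with a higher correlation coefficient and a lower RMSE on the same held-out data fits the observations more closely, which is the sense in which the abstract ranks the SVM model above the other two.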

Author information

Corresponding author

Correspondence to Bharat B. Gulyani.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Gulyani, B.B., Mangai, J.A., Fathima, A. (2015). An Approach for Predicting River Water Quality Using Data Mining Technique. In: Perner, P. (ed.) Advances in Data Mining: Applications and Theoretical Aspects. ICDM 2015. Lecture Notes in Computer Science, vol 9165. Springer, Cham. https://doi.org/10.1007/978-3-319-20910-4_17

  • DOI: https://doi.org/10.1007/978-3-319-20910-4_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-20909-8

  • Online ISBN: 978-3-319-20910-4

  • eBook Packages: Computer Science (R0)
