
A water quality prediction method based on k-nearest-neighbor probability rough sets and PSO-LSTM

Applied Intelligence

Abstract

Water security has attracted worldwide attention, and water quality assessment is central to ensuring it. In China, the South-North Water Transfer Middle Route is an important water supply route for the Beijing-Tianjin-Hebei region and is crucial to the region's economic development and public health. An effective water quality prediction method is therefore essential to prevent water quality degradation and pollution along the Middle Route. In this paper, using water quality data from 13 automatic monitoring stations along the Middle Route, we propose a water quality prediction method based on k-nearest-neighbor probabilistic rough sets and the PSO-LSTM algorithm. Specifically, we first introduce a novel model, k-nearest-neighbor probabilistic rough sets (KNPRSs), by combining the k-nearest-neighbor algorithm with probabilistic rough sets. We then develop an attribute reduction approach based on KNPRSs, which eliminates redundant attributes in water quality assessment and retains the informative ones. Next, we propose a water quality prediction method based on an LSTM neural network whose hyperparameters are adaptively optimized by the PSO algorithm to improve prediction accuracy. Finally, we select historical data from three automatic monitoring stations along the route, take six water quality indicators with practical forecasting value as prediction targets, and conduct comparison experiments.
The experimental results show that the KNPRSs-PSO-LSTM model effectively extracts the key characteristics of water quality attributes and can predict different target indicators at different stations. The model is reliable, stable and efficient, improves prediction accuracy, and can be applied to water quality forecasting and early warning for the South-North Water Transfer Middle Route.
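
The abstract describes using PSO to tune the hyperparameters of an LSTM. The following is a minimal sketch of that idea, assuming a basic global-best PSO with fixed inertia and acceleration coefficients. The function names, coefficient values, search bounds, and the surrogate objective are illustrative assumptions and are not taken from the paper. In a real pipeline, the objective would train an LSTM with the candidate hyperparameters and return its validation error.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=60,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position; the swarm shares one.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def surrogate_validation_error(params):
    """Hypothetical stand-in for 'train an LSTM, return validation error'.

    A smooth bowl with its minimum at 64 hidden units and a learning
    rate of 1e-3; these values are made up for the illustration.
    """
    hidden_units, log10_lr = params
    return ((hidden_units - 64.0) / 64.0) ** 2 + (log10_lr + 3.0) ** 2


# Assumed search space: hidden units in [8, 256], log10(learning rate) in [-5, -1].
bounds = [(8.0, 256.0), (-5.0, -1.0)]
best, err = pso_minimize(surrogate_validation_error, bounds)
best_hidden, best_lr = round(best[0]), 10 ** best[1]
```

Searching the learning rate on a log scale is a common choice because plausible values span several orders of magnitude. Integer hyperparameters such as the number of hidden units are searched as real values here and rounded only when the model would be built.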


Data Availability and Access

The data that has been used is confidential.


Acknowledgements

The work described in this paper was supported by grants from the National Natural Science Foundation of China (Grant nos. 11971365 and 11571010) and the Key Project of Guangxi Natural Science Foundation (Grant No. 2023GXNSFDA026006).

Author information


Contributions

Minrui Huang: Methodology, Investigation, Writing-original draft. Bao Qing Hu: Methodology, Writing-Reviewing and Editing. Haibo Jiang: Methodology, Writing-Reviewing and Editing. Bo Wen Fang: Methodology, Investigation and Writing-Reviewing.

Corresponding author

Correspondence to Bao Qing Hu.

Ethics declarations

Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical and informed consent for data used

This article does not contain any studies with human participants or animals performed by any of the authors.


About this article


Cite this article

Huang, M., Hu, B.Q., Jiang, H. et al. A water quality prediction method based on k-nearest-neighbor probability rough sets and PSO-LSTM. Appl Intell 53, 31106–31128 (2023). https://doi.org/10.1007/s10489-023-05024-2


  • DOI: https://doi.org/10.1007/s10489-023-05024-2
