Abstract
This study assesses the spatiotemporal performance of machine learning-based techniques for simulating streamflow at the continental scale using Long Short-Term Memory (LSTM) models. The dataset is derived from the Model Parameter Estimation Experiment (MOPEX) and encompasses 438 watersheds across the US. MOPEX has a longer data record (55 years) than other datasets, which makes it well suited to LSTM training. The impact of incorporating vegetation Greenness Fraction (GF), in a model variant denoted LSTMGF, was also assessed. A range of assessment metrics was employed to gauge the models' performance both temporally and spatially. With GF integrated, the LSTM models either maintained or improved streamflow simulation across the US, depending on watershed location and season. Notably, the overall median Kling-Gupta Efficiency (KGE) and Percent Bias (PB) values were 0.723 and 4.09 with GF, compared with 0.717 and 4.94 without it. In addition, LSTMGF outperformed LSTM in areas with significant seasonal variation in vegetation cover. The results also show that the extensive MOPEX data record bolstered LSTM performance, with KGE reaching up to 0.97 at certain stations, compared with only 0.86 when 25 years are used for training, as is the case for the Catchment Attributes and Meteorology for Large-sample Studies (CAMELS) dataset. These findings support the potential for integrating LSTM models into continental-scale hydrological models such as the NOAA NextGen National Water Model.
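For readers unfamiliar with the two headline metrics, the following is a minimal sketch of how KGE and PB can be computed for one station. It assumes the original KGE formulation of Gupta et al. (2009) and the percent-bias sign convention of Moriasi et al. (2007), in which positive values indicate underestimation; the abstract does not state which variants the authors used. The function names and the sample values are hypothetical and serve only as illustration.

```python
import math


def kge(sim, obs):
    """Kling-Gupta Efficiency (assumed Gupta et al. 2009 form); 1.0 is a perfect score.

    Undefined (division by zero) if either series is constant or obs has zero mean.
    """
    n = len(obs)
    if n != len(sim) or n < 2:
        raise ValueError("sim and obs must have equal length of at least 2")
    mean_sim = sum(sim) / n
    mean_obs = sum(obs) / n
    # Population standard deviations; the 1/n factor cancels in the ratios below.
    std_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in sim) / n)
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    cov = sum((s - mean_sim) * (o - mean_obs) for s, o in zip(sim, obs)) / n
    r = cov / (std_sim * std_obs)   # linear correlation
    alpha = std_sim / std_obs       # variability ratio
    beta = mean_sim / mean_obs      # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def percent_bias(sim, obs):
    """Percent bias, assumed here as 100 * sum(obs - sim) / sum(obs)."""
    return 100.0 * sum(o - s for s, o in zip(sim, obs)) / sum(obs)


# Made-up daily flows: the simulation overestimates every value by 10%.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
simulated = [1.1, 2.2, 3.3, 4.4, 5.5]
print(round(kge(simulated, observed), 3))
print(round(percent_bias(simulated, observed), 2))
```

In this toy case the correlation is perfect, so the KGE penalty comes entirely from the variability and bias ratios, which illustrates why KGE and PB are reported together: a simulation can track the timing of flows well while still being biased in volume.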
Data availability
The data used in this work are publicly available via the NOAA National Water Services open-access database at https://hydrology.nws.noaa.gov/pub/. All models were trained in a Python 3.7 environment with TensorFlow 2.9.1.
Acknowledgements
The authors thank the NOAA hydrology web portal for providing access to the MOPEX data.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Contributions
AT contributed to conceptualization, methodology, investigation, validation, visualization, writing–original draft, writing–review & editing. MA contributed to conceptualization, methodology, investigation, writing–review & editing. MT contributed to conceptualization, investigation, writing–review & editing.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Tounsi, A., Abdelkader, M. & Temimi, M. Assessing the simulation of streamflow with the LSTM model across the continental United States using the MOPEX dataset. Neural Comput & Applic 35, 22469–22486 (2023). https://doi.org/10.1007/s00521-023-08922-1
DOI: https://doi.org/10.1007/s00521-023-08922-1