Abstract
An in-depth understanding of key processes such as lake evaporation is particularly beneficial when developing an optimal management approach for reservoirs. In this study, we first evaluate the applicability of five regressors, Random Forest (RF), Gradient Boosting (GB), Decision Tree (DT), K Nearest Neighbor (kNN), and XGBoost, for predicting daily lake evaporation at five reservoirs in the Awash River basin, Ethiopia. The best performing models, Gradient Boosting and XGBoost, are then explained through an explanatory framework using daily climate datasets. The interpretability of the models was evaluated using SHapley Additive exPlanations (SHAP). The GB model performed best, with RMSE = 0.045, MSE = 0.031, MAE = 0.002, NSE = 0.997, KGF = 0.991, and RRMSE = 0.011 at Metehara Station; RMSE = 0.032, MSE = 0.024, MAE = 0.001, NSE = 0.998, KGF = 0.999, and RRMSE = 0.008 at Melkasa Station; and RMSE = 0.13, MSE = 0.09, MAE = 0.017, NSE = 0.982, KGF = 0.977, and RRMSE = 0.022 at Dubti Station. XGBoost performed similarly. For both the GB and XGBoost models, the factors with the greatest overall impact on daily evaporation were sunshine hours (SH), month, Tmax, and Tmin at Metehara and Melkasa, and Tmax, Tmin, and month at Dubti. Furthermore, the interpretability analysis showed good agreement between the simulations of the machine learning algorithms (MLAs) and the actual hydro-climatic evaporation process. This result allows decision makers not only to rely on the output of an algorithm, but also to make more informed decisions by using interpretable results for better control of the basin reservoir operating rules.
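The error measures quoted in the abstract can be illustrated with a minimal sketch. The formulas below are the common textbook definitions and are an assumption, not taken from the paper; in particular, RRMSE is taken here as RMSE divided by the mean of the observations, and the `observed` and `simulated` values are made-up numbers for illustration only.

```python
import math

def rmse(obs, sim):
    """Root mean square error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def mae(obs, sim):
    """Mean absolute error."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the residual sum of
    squares to the sum of squared deviations of the observations from their mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread

def rrmse(obs, sim):
    """Relative RMSE, assumed here to be RMSE divided by the observed mean."""
    return rmse(obs, sim) / (sum(obs) / len(obs))

# Hypothetical daily evaporation values (mm/day), not data from the study.
observed = [4.0, 4.5, 3.5, 5.0]
simulated = [4.1, 4.4, 3.6, 4.9]

scores = {
    "RMSE": rmse(observed, simulated),
    "MAE": mae(observed, simulated),
    "NSE": nse(observed, simulated),
    "RRMSE": rrmse(observed, simulated),
}
```

An NSE close to 1 means the residuals are small relative to the natural variability of the observations, which is why it is reported alongside the absolute error measures.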
Data availability
The datasets used during the current study are available from the corresponding author on reasonable request.
Abbreviations
- SHAP: SHapley Additive exPlanations
- ANFIS: Adaptive Neuro-Fuzzy Inference System
- ANN: Artificial Neural Network
- CANFIS: Coactive Neuro-Fuzzy Inference System
- DT: Decision Tree
- EVP: Lake Evaporation
- GB: Gradient Boosting
- GMDH: Group Method of Data Handling
- HHO: Harris Hawks Optimization
- kNN: K Nearest Neighbor
- LS-SVR: Least Squares Support Vector Regression
- MAE: Mean Absolute Error
- MLAs: Machine Learning Algorithms
- MLR: Multiple Linear Regression
- MoWE: Ministry of Water and Energy of Ethiopia
- MSE: Mean Square Error
- NASH: Nash-Sutcliffe model efficiency
- PSO: Particle Swarm Optimization
- RF: Random Forest
- RMSE: Root Mean Square Error
- SVM: Support Vector Machine
- SVR: Support Vector Regression
- WOA: Whale Optimization Algorithm
Acknowledgements
The author expresses gratitude to all governmental bodies for supplying the information needed for this research project. The author also thanks Haramaya University for providing the opportunity to pursue PhD studies and for sponsoring the tuition, and the anonymous reviewer for insightful comments that greatly improved the quality of the paper.
Funding
No funding was received for conducting this study.
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by KD. The first draft of the manuscript was written by KD, while TA and TW gave valuable suggestions and corrections. All authors read and approved the final manuscript.
Kidist Demessie Eshetu: Conceptualization, Methodology, Software, Data collection, Writing - original draft preparation. Tena Alamirew: Supervision, Editing. Tekalegn Ayele Woldesenbet: Writing - reviewing and editing.
Ethics declarations
Ethics approval and consent to participate
Not applicable
Consent for publication
Not applicable
Competing interests
The authors declare that they have no competing interests.
Additional information
Communicated by H. Babaie
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix A: Baseline and Best performance parameters for each model
Hyperparameter | Search space | DT (baseline) | DT (best) | kNN (baseline) | kNN (best) | XGBoost (baseline) | XGBoost (best) | RF (baseline) | RF (best) | GB (baseline) | GB (best) |
---|---|---|---|---|---|---|---|---|---|---|---|
n_estimators | [200, 2000, num = 10] | _ | _ | _ | _ | ['800'] | ['10'] | ['100'] | ['600'] | ['100'] | ['200'] |
max_features | ['auto', 'sqrt', 'log2'] | ['None'] | ['auto'] | _ | _ | ['6'] | ['6'] | ['auto'] | ['auto'] | ['auto'] | ['sqrt'] |
max_depth | [6, 110, num = 11] | ['10'] | ['80'] | _ | _ | _ | _ | ['None'] | ['60'] | ['None'] | ['50'] |
min_samples_split | [2, 5, 10] | ['2'] | ['2'] | _ | _ | _ | _ | ['2'] | ['2'] | ['2'] | ['10'] |
min_samples_leaf | [1, 2, 4] | ['1'] | ['2'] | _ | _ | _ | _ | ['1'] | ['2'] | ['1'] | ['2'] |
bootstrap | [True, False] | _ | _ | _ | _ | _ | _ | ['True'] | ['False'] | _ | _ |
metric | ['euclidean', 'manhattan', 'minkowski'] | _ | _ | ['minkowski'] | ['manhattan'] | _ | _ | _ | _ | _ | _ |
n_neighbors | [3, 5, 11, 19] | _ | _ | ['5'] | ['19'] | _ | _ | _ | _ | _ | _ |
weights | ['uniform', 'distance'] | _ | _ | ['uniform'] | ['distance'] | _ | _ | _ | _ | _ | _ |
min_child_weight | [1, 10, 100] | _ | _ | _ | _ | ['1'] | ['1'] | _ | _ | _ | _ |
learning_rate | [0.05, 0.1, 0.2, 0.3] | _ | _ | _ | _ | ['0.2'] | ['0.3'] | _ | _ | _ | _ |
criterion | ['gini', 'entropy'] | ['gini'] | ['entropy'] | _ | _ | _ | _ | _ | _ | _ | _ |
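
A search space of this kind can be explored by drawing random combinations and keeping the one with the lowest validation error. The sketch below is a pure-Python illustration under stated assumptions, not the authors' tuning code: it uses only a subset of the rows above, it assumes that `[200, 2000, num = 10]` denotes ten evenly spaced values, and `toy_score` is a hypothetical stand-in for a cross-validated error that would come from fitting a model.

```python
import random

def evenly_spaced(start, stop, num):
    """Expand a (start, stop, num) specification into `num` evenly spaced integers.
    Assumption: this is how the '[200, 2000, num = 10]' entry is meant to be read."""
    step = (stop - start) / (num - 1)
    return [int(round(start + i * step)) for i in range(num)]

# Subset of the Appendix A search space (tree-ensemble rows only).
search_space = {
    "n_estimators": evenly_spaced(200, 2000, 10),
    "max_features": ["auto", "sqrt", "log2"],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
}

def sample_configs(space, n_iter, seed=0):
    """Draw n_iter random hyperparameter combinations from the search space."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(values) for name, values in space.items()}
        for _ in range(n_iter)
    ]

def random_search(space, score_fn, n_iter, seed=0):
    """Return the sampled configuration with the lowest score (lower is better)."""
    return min(sample_configs(space, n_iter, seed), key=score_fn)

def toy_score(config):
    """Hypothetical error surface used only to make the sketch runnable.
    In practice this would be a cross-validated RMSE of a fitted model."""
    return abs(config["n_estimators"] - 600) + config["min_samples_leaf"]

candidates = sample_configs(search_space, n_iter=20, seed=42)
best = random_search(search_space, toy_score, n_iter=20, seed=42)
```

Random sampling evaluates a fixed number of combinations regardless of how many hyperparameters are tuned, whereas an exhaustive grid over the four rows used here would already require 10 × 3 × 3 × 3 = 270 model fits.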
Appendix B: Statistical description of the features for three stations
Metehara Station

Statistic | PCP (mm/day) | Tmin (°C) | Tmax (°C) | WS (km/hr) | SH (hrs) | RH6 (%) | RH9 (%) | RH12 (%) | RH15 (%) | RH18 (%) | Tmean (°C) | RHmax (%) | RHmin (%) | RHmean (%) | EVP (mm/day) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Count | 8766 | 8766 | 8766 | 8766 | 8766 | 8766 | 8766 | 8766 | 8766 | 8766 | 8766 | 8766 | 8766 | 8766 | 8766 |
Mean | 1.36 | 18.02 | 33.93 | 1.54 | 8.64 | 81.50 | 67.00 | 48.67 | 39.99 | 42.81 | 25.97 | 82.32 | 38.32 | 60.32 | 4.12 |
Std | 4.88 | 3.85 | 2.82 | 0.56 | 2.42 | 9.76 | 12.86 | 14.12 | 12.36 | 13.29 | 2.758 | 9.329 | 10.96 | 8.81 | 0.81 |
Min | 0.00 | 0.20 | 10.20 | 0.00 | 0.00 | 2.00 | 5.00 | 11.00 | 6.00 | 5.00 | 12.1 | 39.27 | 2 | 28 | 1.10 |
25% | 0.00 | 15.80 | 32.00 | 1.20 | 7.60 | 76.00 | 58.00 | 39.00 | 31.00 | 33.00 | 24.2 | 76 | 31 | 54.5 | 3.74 |
50% | 0.00 | 19.00 | 34.00 | 1.50 | 9.50 | 82.00 | 66.00 | 46.00 | 38.00 | 41.00 | 26.25 | 82 | 37 | 59.5 | 4.20 |
75% | 0.00 | 20.60 | 36.00 | 1.77 | 10.40 | 88.00 | 75.00 | 56.00 | 46.00 | 50.00 | 28 | 89 | 44 | 65.5 | 4.71 |
Max | 72.90 | 28.50 | 41.86 | 9.00 | 11.80 | 100 | 100 | 100 | 100 | 100 | 34.65 | 100 | 100 | 100 | 5.56 |

Melkasa Station

Statistic | Tmin (°C) | Tmax (°C) | PCP (mm/day) | SH (hrs) | WS (km/hr) | RHmax (%) | RHmin (%) | RHmean (%) | Tmean (°C) | EVP (mm/day) |
---|---|---|---|---|---|---|---|---|---|---|
Count | 8035 | 8035 | 8035 | 8035 | 8035 | 8035 | 8035 | 8035 | 8035 | 8035 |
Mean | 13.56 | 28.84 | 2.30 | 8.48 | 4.95 | 86.35 | 45.98 | 66.17 | 21.20 | 3.90 |
Std | 3.31 | 2.64 | 7.17 | 2.55 | 4.04 | 7.04 | 10.78 | 7.13 | 2.25 | 0.79 |
Min | -0.50 | 17.00 | 0.00 | 0.00 | 0.10 | 38.63 | 17.78 | 38.63 | 13.50 | 1.05 |
25% | 11.50 | 27.00 | 0.00 | 7.20 | 2.10 | 83.44 | 37.55 | 60.98 | 19.75 | 3.54 |
50% | 14.28 | 28.60 | 0.00 | 9.40 | 3.30 | 86.70 | 45.40 | 66.00 | 21.25 | 3.97 |
75% | 16.00 | 30.60 | 0.50 | 10.40 | 7.90 | 88.40 | 53.59 | 71.04 | 22.75 | 4.46 |
Max | 23.50 | 37.50 | 88.00 | 12.20 | 21.40 | 100 | 88.00 | 94 | 27.50 | 5.43 |

Dubti Station

Statistic | Tmin (°C) | Tmax (°C) | PCP (mm/day) | Tmean (°C) | EVP (mm/day) |
---|---|---|---|---|---|
Count | 4018 | 4018 | 4018 | 4018 | 4018 |
Mean | 22.37 | 38.51 | 0.55 | 30.44 | 6.46 |
Std | 4.36 | 3.60 | 3.47 | 3.69 | 1.03 |
Min | 6.00 | 26.00 | 0.00 | 21.00 | 2.86 |
25% | 19.00 | 35.50 | 0.00 | 27.25 | 5.60 |
50% | 23.00 | 39.00 | 0.00 | 31.00 | 6.61 |
75% | 25.80 | 41.50 | 0.00 | 33.50 | 7.26 |
Max | 32.20 | 46.30 | 68.20 | 38.50 | 9.40 |
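
The summary rows in these tables (count, mean, standard deviation, minimum, quartiles, maximum) can be reproduced for any single feature with a short routine. The sketch below uses only the Python standard library; the use of the sample standard deviation and of linearly interpolated quartiles is an assumption about how the tables were produced, and the `evp` values are made-up numbers, not station data.

```python
import statistics

def describe(values):
    """Summarise one feature with the same rows as the Appendix B tables.
    Assumptions: sample standard deviation (n - 1 denominator) and
    linearly interpolated quartiles that include both end points."""
    data = sorted(values)
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "Count": len(data),
        "Mean": statistics.fmean(data),
        "Std": statistics.stdev(data),
        "Min": data[0],
        "25%": q1,
        "50%": q2,
        "75%": q3,
        "Max": data[-1],
    }

# Hypothetical daily evaporation values (mm/day) for illustration only.
evp = [3.0, 4.0, 4.5, 5.0, 3.5]
summary = describe(evp)
```

Applying `describe` to each column of a station's daily record would yield one table column per feature, which makes it easy to spot implausible extremes before model training.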
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Eshetu, K.D., Alamirew, T. & Woldesenbet, T.A. Interpretable machine learning for predicting evaporation from Awash reservoirs, Ethiopia. Earth Sci Inform 16, 3209–3226 (2023). https://doi.org/10.1007/s12145-023-01063-y
DOI: https://doi.org/10.1007/s12145-023-01063-y