Interpretable machine learning for predicting evaporation from Awash reservoirs, Ethiopia

Research article published in Earth Science Informatics

Abstract

A sound understanding of lake evaporation is essential for developing optimal reservoir management strategies. In this study, we first evaluate the applicability of five regressors, Random Forest (RF), Gradient Boosting (GB), Decision Tree (DT), K Nearest Neighbor (kNN), and XGBoost, for predicting daily lake evaporation from five reservoirs in the Awash River basin, Ethiopia. The best-performing models, GB and XGBoost, are then explained through an explanatory framework using daily climate datasets, with model interpretability evaluated using SHapley Additive exPlanations (SHAP). The GB model performed best, with RMSE = 0.045, MSE = 0.031, MAE = 0.002, NSE = 0.997, KGF = 0.991, and RRMSE = 0.011 at Metehara Station; RMSE = 0.032, MSE = 0.024, MAE = 0.001, NSE = 0.998, KGF = 0.999, and RRMSE = 0.008 at Melkasa Station; and RMSE = 0.13, MSE = 0.09, MAE = 0.017, NSE = 0.982, KGF = 0.977, and RRMSE = 0.022 at Dubti Station, matching the performance of XGBoost. For both GB and XGBoost, the factors with the greatest overall impact on daily evaporation were sunshine hours (SH), month, maximum temperature (Tmax), and minimum temperature (Tmin) at Metehara and Melkasa, and Tmax, Tmin, and month at Dubti. Furthermore, the model interpretations showed good agreement between the simulations of the machine learning algorithms (MLAs) and the actual hydro-climatic evaporation process. These results allow decision makers not merely to rely on an algorithm's output but to make better-informed decisions, using interpretable results to improve control of the basin's reservoir operating rules.
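The goodness-of-fit statistics quoted in the abstract can be illustrated with a minimal sketch. The definitions below are the conventional ones and are assumptions here, since this excerpt does not give the formulas the authors used: "KGF" is read as the Kling-Gupta efficiency, and RRMSE is taken as RMSE divided by the mean of the observations. The function name and the sample values are hypothetical.

```python
import math

def evaluation_metrics(observed, simulated):
    """Return common goodness-of-fit statistics for two equal-length series."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    errors = [s - o for o, s in zip(observed, simulated)]

    # Error magnitudes: by these definitions MSE is the square of RMSE.
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n

    # Nash-Sutcliffe efficiency: 1 minus residual variance over observed variance.
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    nse = 1.0 - sum(e * e for e in errors) / ss_obs

    # Kling-Gupta efficiency (assumed meaning of "KGF"): combines correlation r,
    # variability ratio alpha, and bias ratio beta.
    ss_sim = sum((s - mean_sim) ** 2 for s in simulated)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))
    r = cov / math.sqrt(ss_obs * ss_sim)
    alpha = math.sqrt(ss_sim / ss_obs)
    beta = mean_sim / mean_obs
    kge = 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

    # Relative RMSE (assumed normalisation: mean of the observations).
    rrmse = rmse / mean_obs

    return {"RMSE": rmse, "MSE": mse, "MAE": mae,
            "NSE": nse, "KGE": kge, "RRMSE": rrmse}

# Hypothetical daily evaporation values (mm/day), for illustration only.
observed = [3.0, 4.0, 5.0, 4.5]
simulated = [3.1, 3.9, 5.1, 4.4]
metrics = evaluation_metrics(observed, simulated)
```

A perfect simulation gives RMSE, MSE, and MAE of 0 and NSE and KGE of 1, which is the direction in which the reported scores should be read.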

Data availability

The datasets used during the current study are available from the corresponding author on reasonable request.

Abbreviations

SHAP: SHapley Additive exPlanations
ANFIS: Adaptive Neuro-Fuzzy Inference System
ANN: Artificial Neural Network
CANFIS: Coactive Neuro-Fuzzy Inference System
DT: Decision Tree
EVP: Lake Evaporation
GB: Gradient Boosting
GMDH: Group Method of Data Handling
HHO: Harris Hawks Optimization
kNN: K Nearest Neighbor
LS-SVR: Least Squares Support Vector Regression
MAE: Mean Absolute Error
MLAs: Machine Learning Algorithms
MLR: Multiple Linear Regression
MoWE: Ministry of Water and Energy of Ethiopia
MSE: Mean Square Error
NASH: Nash-Sutcliffe Model Efficiency
PSO: Particle Swarm Optimization
RF: Random Forest
RMSE: Root Mean Square Error
SVM: Support Vector Machine
SVR: Support Vector Regression
WOA: Whale Optimization Algorithm


Acknowledgements

The author thanks all governmental bodies that supplied the information needed for this research project. The author also thanks Haramaya University for the opportunity to pursue PhD studies and for sponsoring tuition, and the anonymous reviewer, whose insightful comments greatly improved the quality of the paper.

Funding

No funding was received for conducting this study.

Author information

Contributions

All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by KD. KD wrote the first draft of the manuscript, and TA and TW provided valuable suggestions and corrections. All authors read and approved the final manuscript.

Kidist Demessie Eshetu: Conceptualization, Methodology, Software, Data collection, Writing (original draft preparation). Tena Alamirew: Supervision, Editing. Tekalegn Ayele Woldesenbet: Writing (reviewing and editing).

Corresponding author

Correspondence to Kidist Demessie Eshetu.

Ethics declarations

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Competing interests

The authors declare that they have no competing interests.

Additional information

Communicated by H. Babaie

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Baseline and Best performance parameters for each model

| Hyperparameter | Search space | DT baseline | DT best | kNN baseline | kNN best | XGBoost baseline | XGBoost best | RF baseline | RF best | GB baseline | GB best |
|---|---|---|---|---|---|---|---|---|---|---|---|
| n_estimators | [200, 2000, num = 10] | - | - | - | - | 800 | 10 | 100 | 600 | 100 | 200 |
| max_features | ['auto', 'sqrt', 'log2'] | None | auto | - | - | 6 | 6 | auto | auto | auto | sqrt |
| max_depth | [6, 110, num = 11] | 10 | 80 | - | - | - | - | None | 60 | None | 50 |
| min_samples_split | [2, 5, 10] | 2 | 2 | - | - | - | - | 2 | 2 | 2 | 10 |
| min_samples_leaf | [1, 2, 4] | 1 | 2 | - | - | - | - | 1 | 2 | 1 | 2 |
| bootstrap | [True, False] | - | - | - | - | - | - | True | False | - | - |
| metric | ['euclidean', 'manhattan', 'minkowski'] | - | - | minkowski | manhattan | - | - | - | - | - | - |
| n_neighbors | [3, 5, 11, 19] | - | - | 5 | 19 | - | - | - | - | - | - |
| weights | ['uniform', 'distance'] | - | - | uniform | distance | - | - | - | - | - | - |
| min_child_weight | [1, 10, 100] | - | - | - | - | 1 | 1 | - | - | - | - |
| learning_rate | [0.05, 0.1, 0.2, 0.3] | - | - | - | - | 0.2 | 0.3 | - | - | - | - |
| criterion | ['gini', 'entropy'] | gini | entropy | - | - | - | - | - | - | - | - |
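The search space in Appendix A can be expanded and sampled as in the following minimal sketch. It assumes that an entry such as [200, 2000, num = 10] denotes ten evenly spaced integers between the two bounds, and that candidate configurations are drawn at random; neither assumption is confirmed by this excerpt. The helper names `linspace_int` and `sample_configs` are hypothetical, and only the tree-ensemble parameters are included.

```python
import random

def linspace_int(start, stop, num):
    """Return `num` evenly spaced integers from start to stop, inclusive."""
    step = (stop - start) / (num - 1)
    return [int(round(start + i * step)) for i in range(num)]

# Tree-ensemble part of the Appendix A search space (assumed reading of the
# "[low, high, num = k]" notation as k evenly spaced values).
search_space = {
    "n_estimators": linspace_int(200, 2000, 10),
    "max_features": ["auto", "sqrt", "log2"],
    "max_depth": linspace_int(6, 110, 11),
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
    "bootstrap": [True, False],
}

def sample_configs(space, n_iter, seed=0):
    """Draw n_iter random configurations, one value per hyperparameter."""
    rng = random.Random(seed)
    return [{name: rng.choice(values) for name, values in space.items()}
            for _ in range(n_iter)]

# Size of the full grid, to show why sampling is cheaper than exhaustive search.
grid_size = 1
for values in search_space.values():
    grid_size *= len(values)

configs = sample_configs(search_space, n_iter=5)
```

Each sampled dictionary could then be passed to a regressor constructor and scored by cross-validation. Note that recent scikit-learn releases no longer accept 'auto' for max_features, so that value would need to be replaced when running against a current version.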

Appendix B: Statistical description of the features for three stations

 

Metehara Station

| Statistic | PCP (mm/day) | Tmin (°C) | Tmax (°C) | WS (km/hr) | SH (hrs) | RH6 (%) | RH9 (%) | RH12 (%) | RH15 (%) | RH18 (%) | Tmean (°C) | Rhmax (%) | Rhmin (%) | Rhmean (%) | EVP (mm/day) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Count | 8766 | 8766 | 8766 | 8766 | 8766 | 8766 | 8766 | 8766 | 8766 | 8766 | 8766 | 8766 | 8766 | 8766 | 8766 |
| Mean | 1.36 | 18.02 | 33.93 | 1.54 | 8.64 | 81.50 | 67.00 | 48.67 | 39.99 | 42.81 | 25.97 | 82.32 | 38.32 | 60.32 | 4.12 |
| Std | 4.88 | 3.85 | 2.82 | 0.56 | 2.42 | 9.76 | 12.86 | 14.12 | 12.36 | 13.29 | 2.758 | 9.329 | 10.96 | 8.81 | 0.81 |
| Min | 0.00 | 0.20 | 10.20 | 0.00 | 0.00 | 2.00 | 5.00 | 11.00 | 6.00 | 5.00 | 12.1 | 39.27 | 2 | 28 | 1.10 |
| 25% | 0.00 | 15.80 | 32.00 | 1.20 | 7.60 | 76.00 | 58.00 | 39.00 | 31.00 | 33.00 | 24.2 | 76 | 31 | 54.5 | 3.74 |
| 50% | 0.00 | 19.00 | 34.00 | 1.50 | 9.50 | 82.00 | 66.00 | 46.00 | 38.00 | 41.00 | 26.25 | 82 | 37 | 59.5 | 4.20 |
| 75% | 0.00 | 20.60 | 36.00 | 1.77 | 10.40 | 88.00 | 75.00 | 56.00 | 46.00 | 50.00 | 28 | 89 | 44 | 65.5 | 4.71 |
| Max | 72.90 | 28.50 | 41.86 | 9.00 | 11.80 | 100 | 100 | 100 | 100 | 100 | 34.65 | 100 | 100 | 100 | 5.56 |

 

Melkasa Station

| Statistic | Tmin (°C) | Tmax (°C) | PCP (mm/day) | SH (hrs) | WS (km/hr) | Rhmax (%) | Rhmin (%) | Rhmean (%) | Tmean (°C) | EVP (mm/day) |
|---|---|---|---|---|---|---|---|---|---|---|
| Count | 8035 | 8035 | 8035 | 8035 | 8035 | 8035 | 8035 | 8035 | 8035 | 8035 |
| Mean | 13.56 | 28.84 | 2.30 | 8.48 | 4.95 | 86.35 | 45.98 | 66.17 | 21.20 | 3.90 |
| Std | 3.31 | 2.64 | 7.17 | 2.55 | 4.04 | 7.04 | 10.78 | 7.13 | 2.25 | 0.79 |
| Min | -0.50 | 17.00 | 0.00 | 0.00 | 0.10 | 38.63 | 17.78 | 38.63 | 13.50 | 1.05 |
| 25% | 11.50 | 27.00 | 0.00 | 7.20 | 2.10 | 83.44 | 37.55 | 60.98 | 19.75 | 3.54 |
| 50% | 14.28 | 28.60 | 0.00 | 9.40 | 3.30 | 86.70 | 45.40 | 66.00 | 21.25 | 3.97 |
| 75% | 16.00 | 30.60 | 0.50 | 10.40 | 7.90 | 88.40 | 53.59 | 71.04 | 22.75 | 4.46 |
| Max | 23.50 | 37.50 | 88.00 | 12.20 | 21.40 | 100 | 88.00 | 94 | 27.50 | 5.43 |

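Dubti Station

| Statistic | Tmin (°C) | Tmax (°C) | PCP (mm/day) | Tmean (°C) | EVP (mm/day) |
|---|---|---|---|---|---|
| Count | 4018 | 4018 | 4018 | 4018 | 4018 |
| Mean | 22.37 | 38.51 | 0.55 | 30.44 | 6.46 |
| Std | 4.36 | 3.60 | 3.47 | 3.69 | 1.03 |
| Min | 6.00 | 26.00 | 0.00 | 21.00 | 2.86 |
| 25% | 19.00 | 35.50 | 0.00 | 27.25 | 5.60 |
| 50% | 23.00 | 39.00 | 0.00 | 31.00 | 6.61 |
| 75% | 25.80 | 41.50 | 0.00 | 33.50 | 7.26 |
| Max | 32.20 | 46.30 | 68.20 | 38.50 | 9.40 |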
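The summary rows in Appendix B (count, mean, std, min, quartiles, max) can be reproduced for any feature with a short sketch like the one below. It assumes the sample standard deviation and linearly interpolated quartiles, which is a common convention but is not stated in this excerpt. The function name `describe` and the sample values are hypothetical.

```python
import statistics

def describe(values):
    """Summarise one feature with the statistics used in Appendix B."""
    data = sorted(values)
    # Quartiles with linear interpolation between closest ranks (assumed method).
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "count": len(data),
        "mean": statistics.fmean(data),
        "std": statistics.stdev(data),  # sample standard deviation (assumed)
        "min": data[0],
        "25%": q1,
        "50%": q2,
        "75%": q3,
        "max": data[-1],
    }

# Hypothetical week of daily evaporation values (mm/day), for illustration only.
evp = [3.7, 4.2, 4.7, 4.1, 3.9, 4.4, 4.0]
summary = describe(evp)
```

Applying the same function to each column of a station's daily record would yield one table per station in the layout shown above.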

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Eshetu, K.D., Alamirew, T. & Woldesenbet, T.A. Interpretable machine learning for predicting evaporation from Awash reservoirs, Ethiopia. Earth Sci Inform 16, 3209–3226 (2023). https://doi.org/10.1007/s12145-023-01063-y

  • DOI: https://doi.org/10.1007/s12145-023-01063-y
