Infilling Missing Rainfall and Runoff Data for Sarawak, Malaysia Using Gaussian Mixture Model Based K-Nearest Neighbor Imputation

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11606)

Abstract

Hydrologists often encounter missing values in rainfall and runoff databases. They tend to use the normal ratio or distance power method to fill these gaps. However, these methods are time consuming and often less accurate. In this paper, two neighbor-based imputation methods, K-nearest neighbor (KNN) imputation and Gaussian mixture model based KNN imputation (GMM-KNN), were explored for filling gaps in a rainfall and runoff database. Missing entries were inserted randomly into the database at rates of 2%, 5%, 10%, 15% and 20%. The pros and cons of the two methods were compared and discussed. The selected study area is Bedup Basin, located in Samarahan Division, Sarawak, East Malaysia. The GMM-KNN imputation method was observed to yield the best estimation accuracy for the missing rainfall and runoff data.
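The evaluation protocol described in the abstract (hide a known fraction of entries at random, impute them, and compare against the hidden values) can be illustrated with a minimal sketch of the plain KNN baseline. This is not the authors' implementation: the function names `mask_random`, `knn_impute` and `rmse`, the choice of k, the distance rescaling, and the synthetic rainfall-runoff pairs are all assumptions made for illustration. The GMM-based variant is not sketched because the abstract does not give its details.

```python
import math
import random

def mask_random(rows, rate, seed=0):
    """Copy rows and hide about `rate` of all cells (set to None) at random."""
    rng = random.Random(seed)
    cells = [(i, j) for i in range(len(rows)) for j in range(len(rows[0]))]
    hidden = rng.sample(cells, int(round(rate * len(cells))))
    masked = [list(r) for r in rows]
    for i, j in hidden:
        masked[i][j] = None
    return masked, hidden

def distance(a, b):
    """Euclidean distance over columns observed in both rows, rescaled
    to the full row width; infinite when the rows share no column."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))

def knn_impute(rows, k=3):
    """Fill each None with the mean of that column over the k nearest rows."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Donors are the other rows that observed column j.
            donors = sorted(
                (distance(row, other), other[j])
                for n, other in enumerate(rows)
                if n != i and other[j] is not None
            )
            nearest = [v for d, v in donors[:k] if d < math.inf]
            if not nearest:
                # No comparable neighbor: fall back to the column mean.
                nearest = [v for _, v in donors]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled

def rmse(truth, filled, hidden):
    """Root mean squared error over the cells that were hidden."""
    errors = [
        (truth[i][j] - filled[i][j]) ** 2
        for i, j in hidden
        if filled[i][j] is not None
    ]
    return math.sqrt(sum(errors) / len(errors)) if errors else float("nan")

# Synthetic (rainfall, runoff) pairs, NOT data from Bedup Basin.
rng = random.Random(1)
truth = []
for _ in range(50):
    rain = rng.uniform(0.0, 80.0)
    truth.append([rain, 0.4 * rain + rng.gauss(0.0, 2.0)])

for rate in (0.02, 0.05, 0.10, 0.15, 0.20):
    masked, hidden = mask_random(truth, rate, seed=42)
    filled = knn_impute(masked, k=3)
    print(f"{rate:.0%} missing: RMSE = {rmse(truth, filled, hidden):.3f}")
```

Keeping the hidden positions alongside the masked copy lets the same score be computed for any competing imputation method on identical gaps, which is what makes a comparison across the five missing-data rates meaningful.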

Acknowledgments

The authors sincerely acknowledge the Department of Irrigation and Drainage (DID), Sarawak, Malaysia for providing the rainfall and runoff data used in this study. The authors thank Universiti Teknologi Malaysia (UTM) for support under Research University Grant Vot-20H04, Malaysia Research University Network (MRUN) Vot 4L876 and the Fundamental Research Grant Scheme (FRGS) Vot 5F073, supported under the Ministry of Education Malaysia, for the completion of the research. The work was also supported by the SPEV project, University of Hradec Kralove, FIM, Czech Republic (ID: 2102–2019). We are also grateful for the support of Ph.D. student Sebastien Mambou in consultations regarding application aspects.

Author information

Corresponding author

Correspondence to Po Chan Chiu.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Chiu, P.C., Selamat, A., Krejcar, O. (2019). Infilling Missing Rainfall and Runoff Data for Sarawak, Malaysia Using Gaussian Mixture Model Based K-Nearest Neighbor Imputation. In: Wotawa, F., Friedrich, G., Pill, I., Koitz-Hristov, R., Ali, M. (eds) Advances and Trends in Artificial Intelligence. From Theory to Practice. IEA/AIE 2019. Lecture Notes in Computer Science, vol 11606. Springer, Cham. https://doi.org/10.1007/978-3-030-22999-3_3

  • DOI: https://doi.org/10.1007/978-3-030-22999-3_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-22998-6

  • Online ISBN: 978-3-030-22999-3

  • eBook Packages: Computer Science, Computer Science (R0)
