Imputation of Rainfall Data Using Improved Neural Network Algorithm

  • Conference paper
Pattern Recognition. ICPR International Workshops and Challenges (ICPR 2021)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12664)

Abstract

Missing rainfall data reduce the quality of hydrological data analysis because rainfall records are an essential input for hydrological modeling. Much research has focused on rainfall data imputation; however, the compatibility of precipitation (rainfall) and non-precipitation (meteorological) data as combined input has received less attention. First, we propose a novel input structure for missing data imputation. Principal component analysis (PCA) is used to extract the most relevant features from the meteorological data, and the significant principal components (PCs) are combined with rainfall data from the nearest neighboring gauging stations as the input for estimating the missing values. Second, the effect of this combined input on infilling missing rainfall data series is compared using a sine cosine algorithm neural network (SCANN) and a feedforward neural network (FFNN). The results show that SCANN outperformed FFNN in terms of mean absolute error (MAE), root mean square error (RMSE), and correlation coefficient (R), with an average accuracy of more than 90%. The study also shows that the precision of both imputation methods decreased as the percentage of missingness increased.
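The evaluation named in the abstract (MAE, RMSE and R at a given level of missingness) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the function names `imputation_scores` and `mask_indices`, the rainfall values, and the 30% missingness level are hypothetical, and the neighbor-mean imputer is only a stand-in, not the SCANN or FFNN models of the paper.

```python
import math
import random


def imputation_scores(observed, imputed):
    """Return (MAE, RMSE, Pearson R) for paired observed and imputed values."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, imputed)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_o = sum(observed) / n
    mean_p = sum(imputed) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, imputed))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in imputed)
    # R is undefined when either series is constant; report NaN in that case.
    r = cov / math.sqrt(var_o * var_p) if var_o > 0 and var_p > 0 else float("nan")
    return mae, rmse, r


def mask_indices(n, missing_fraction, seed=0):
    """Choose which of n records to hide to simulate a missingness level."""
    rng = random.Random(seed)
    k = max(1, round(n * missing_fraction))
    return sorted(rng.sample(range(n), k))


# Hypothetical daily rainfall (mm) at a target gauge and two neighboring gauges.
target = [0.0, 12.5, 3.0, 0.0, 20.0, 7.5, 1.0, 0.0, 15.0, 4.0]
neighbors = [
    [0.5, 11.0, 2.0, 0.0, 22.0, 6.0, 1.5, 0.0, 13.0, 5.0],
    [0.0, 14.0, 4.0, 0.5, 18.0, 9.0, 0.5, 0.0, 16.0, 3.0],
]

hidden = mask_indices(len(target), missing_fraction=0.3)
observed = [target[i] for i in hidden]
# Stand-in imputer (NOT SCANN or FFNN): mean of the neighboring gauges.
imputed = [sum(s[i] for s in neighbors) / len(neighbors) for i in hidden]
mae, rmse, r = imputation_scores(observed, imputed)
print(f"hidden={hidden} MAE={mae:.3f} RMSE={rmse:.3f} R={r:.3f}")
```

Repeating the masking step with larger values of `missing_fraction` gives one simple way to examine how the scores change as missingness grows, which is the kind of comparison the abstract reports.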

Acknowledgment

The authors would like to acknowledge the Malaysian Meteorological Department and the Department of Irrigation and Drainage (DID), Sarawak, Malaysia, for providing the meteorological and rainfall data used in this study. The authors sincerely thank Universiti Teknologi Malaysia (UTM) for support under Research University Grant Vot-20H04, Malaysia Research University Network (MRUN) Vot 4L876, Fundamental Research Grant Scheme (FRGS) Vot 5F073, and SLAI, supported under the Ministry of Higher Education Malaysia, for the completion of the research. The work is partially supported by the SPEV project (ID: 2103–2020), Faculty of Informatics and Management, University of Hradec Kralove. We are also grateful to Ph.D. students Jan Hruska and Michal Dobrovolny of the University of Hradec Kralove, Czech Republic, for their support in consultations regarding application aspects. The APC was funded by the SPEV project 2103/2020, Faculty of Informatics and Management, University of Hradec Kralove.

Author information

Corresponding author

Correspondence to Ali Selamat.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Chiu, P.C., Selamat, A., Krejcar, O., Kuok, K.K. (2021). Imputation of Rainfall Data Using Improved Neural Network Algorithm. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol 12664. Springer, Cham. https://doi.org/10.1007/978-3-030-68799-1_28

  • DOI: https://doi.org/10.1007/978-3-030-68799-1_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-68798-4

  • Online ISBN: 978-3-030-68799-1

  • eBook Packages: Computer Science, Computer Science (R0)
