Abstract
Missing rainfall data reduce the quality of hydrological analysis because rainfall is an essential input for hydrological modeling. Much research has focused on rainfall data imputation. However, the compatibility of precipitation (rainfall) and non-precipitation (meteorological) data as inputs has received less attention. First, we propose a novel input structure for missing data imputation. Principal component analysis (PCA) is used to extract the most relevant features from the meteorological data. The significant principal components (PCs) are then combined with rainfall data from the nearest neighbor gauging stations to form the input for estimating the missing values. Second, the effect of this combined input on infilling missing rainfall series was compared using the sine cosine algorithm neural network (SCANN) and the feedforward neural network (FFNN). The results showed that SCANN outperformed FFNN in terms of mean absolute error (MAE), root mean square error (RMSE) and correlation coefficient (R), with an average accuracy of more than 90%. The study also showed that the precision of both imputation methods decreased as the percentage of missingness increased.
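The combined input structure and the three evaluation metrics can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the 90% explained-variance rule for choosing the significant PCs, the synthetic data and the 20% missingness rate are all assumptions made for illustration, and an ordinary least-squares fit stands in for the SCANN and FFNN estimators, which are not reproduced here.

```python
import numpy as np


def significant_pcs(meteo, var_threshold=0.9):
    """Scores of the leading PCs of standardized meteorological data.

    The explained-variance cutoff is an assumed selection rule.
    """
    z = (meteo - meteo.mean(axis=0)) / meteo.std(axis=0)
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    cum_var = np.cumsum(s**2) / np.sum(s**2)
    # Smallest number of components whose cumulative variance reaches the cutoff.
    k = min(int(np.searchsorted(cum_var, var_threshold)) + 1, vt.shape[0])
    return z @ vt[:k].T


def build_combined_input(meteo, neighbor_rain, var_threshold=0.9):
    """Concatenate the significant PC scores with neighbor-station rainfall."""
    return np.hstack([significant_pcs(meteo, var_threshold), neighbor_rain])


def imputation_metrics(observed, estimated):
    """MAE, RMSE and Pearson correlation coefficient R."""
    err = estimated - observed
    return {
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(np.mean(err**2))),
        "R": float(np.corrcoef(observed, estimated)[0, 1]),
    }


# Synthetic stand-in data (hypothetical, not the Sarawak records).
rng = np.random.default_rng(0)
meteo = rng.normal(size=(200, 5))                   # 5 meteorological variables
neighbor_rain = rng.gamma(2.0, 5.0, size=(200, 3))  # 3 neighbor gauging stations
target = neighbor_rain.mean(axis=1) + rng.normal(scale=1.0, size=200)

X = build_combined_input(meteo, neighbor_rain)
missing = rng.random(200) < 0.2                     # assumed 20% missingness

# Least-squares stand-in for the neural network estimators.
A = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(A[~missing], target[~missing], rcond=None)
estimated = A[missing] @ coef

print(imputation_metrics(target[missing], estimated))
```

Raising the threshold in `rng.random(200) < 0.2` gives a simple way to check how the metrics change as the share of missing values grows.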
Acknowledgment
The authors would like to acknowledge the Malaysian Meteorological Department and the Department of Irrigation and Drainage (DID), Sarawak, Malaysia, for providing the meteorological and rainfall data used in this study. The authors sincerely thank Universiti Teknologi Malaysia (UTM) under Research University Grant Vot-20H04, Malaysia Research University Network (MRUN) Vot 4L876, Fundamental Research Grant Scheme (FRGS) Vot 5F073 and SLAI supported under Ministry of Higher Education Malaysia for the completion of the research. The work is partially supported by the SPEV project (ID: 2103–2020), Faculty of Informatics and Management, University of Hradec Kralove. We are also grateful to Ph.D. students Jan Hruska and Michal Dobrovolny of the University of Hradec Kralove, Czech Republic, for consultations regarding application aspects. The APC was funded by the SPEV project 2103/2020, Faculty of Informatics and Management, University of Hradec Kralove.
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Chiu, P.C., Selamat, A., Krejcar, O., Kuok, K.K. (2021). Imputation of Rainfall Data Using Improved Neural Network Algorithm. In: Del Bimbo, A., et al. Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol 12664. Springer, Cham. https://doi.org/10.1007/978-3-030-68799-1_28
DOI: https://doi.org/10.1007/978-3-030-68799-1_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-68798-4
Online ISBN: 978-3-030-68799-1
eBook Packages: Computer Science, Computer Science (R0)