Abstract
A measurement buoy with attached sensors has been deployed at our study area to monitor hydrodynamics, water properties, and water quality conditions. High-resolution temporal data have been collected and streamed to an online system that is accessible in near real time. In certain circumstances, however, the sensors may fail to provide continuous, high-quality data, which results in gaps or corrupted values. The aim of this study was to reconstruct these faulty values. This paper proposes a method based on a data-driven model that combines an Artificial Neural Network with a Genetic Algorithm to generate a synthetic data series, which can be used to patch the incomplete measured data. Additional improvements were achieved by removing seasonal patterns from the original time series with a wavelet decomposition before training the data-driven model. The proposed model was compared with a standard missing-data imputation method based on the Kohonen self-organizing map to further assess its performance. The algorithm was applied to water temperature data, but the same approach is applicable to other parameters of interest.
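As an illustration of the seasonal-removal step only, the following minimal sketch separates a series into a smooth component and a residual. It assumes a Haar wavelet, for which the approximation at a given level reduces to block means; the abstract does not name the wavelet family or the decomposition level, so both are assumptions here. The function names and the synthetic series are hypothetical, and the sketch does not implement the Artificial Neural Network or Genetic Algorithm.

```python
import numpy as np


def haar_approximation(x, level):
    """Return the Haar approximation of `x` at the given level.

    For the Haar wavelet the approximation is piecewise constant:
    each block of 2**level samples is replaced by its mean.
    """
    x = np.asarray(x, dtype=float)
    block = 2 ** level
    if x.size % block != 0:
        raise ValueError("series length must be a multiple of 2**level")
    block_means = x.reshape(-1, block).mean(axis=1)
    return np.repeat(block_means, block)


def split_seasonal(x, level):
    """Split `x` into (smooth, residual) with smooth + residual == x."""
    x = np.asarray(x, dtype=float)
    smooth = haar_approximation(x, level)
    return smooth, x - smooth


# Synthetic example (assumed values, not measured data):
# a slow cycle plus a faster fluctuation around 28 degrees C.
t = np.arange(256)
series = (28.0
          + 1.5 * np.sin(2 * np.pi * t / 256)
          + 0.2 * np.sin(2 * np.pi * t / 8))
smooth, residual = split_seasonal(series, level=4)
```

Under these assumptions, a data-driven model would be trained on `residual` rather than on the raw series, and the smooth component would be added back to the model output when patching a gap.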
Acknowledgments
The authors are grateful to the Public Utilities Board of Singapore for sponsoring this work and to all Tropical Marine Science Institute (TMSI) colleagues for their valuable contributions to the success of this study. We would also like to thank the anonymous reviewers for their insightful comments and suggestions to improve the quality of the paper.
Additional information
Communicated by: H. A. Babaie
About this article
Cite this article
Mulia, I.E., Asano, T. & Tkalich, P. Retrieval of missing values in water temperature series using a data-driven model. Earth Sci Inform 8, 787–798 (2015). https://doi.org/10.1007/s12145-015-0210-x
DOI: https://doi.org/10.1007/s12145-015-0210-x