Abstract
Artificial neural network (ANN)-based data-driven models are effective and robust tools for multi-input single-output (MISO) system simulation. However, several difficulties degrade the performance of ANN models: topology design, parameter training, and the balance between simulation accuracy and generalization capability. To overcome these difficulties, this paper proposes a novel hybrid data-driven model named KEK. The KEK model couples the K-means method for input clustering, an ensemble back-propagation (BP) ANN for output estimation, and the K-nearest neighbor (KNN) method for output error estimation. A novel calibration method is also proposed for automatic, global calibration of the KEK model. To compare model performance, the ANN model, the KNN model, and the proposed KEK model were applied to two cases: simulation of the Peak benchmark function and daily total load forecasting for a real-world electricity system. The testing results indicate that the KEK model outperformed the other two models and showed very good simulation accuracy and generalization capability in MISO system simulation tasks.
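The coupling of the three components can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify how the parts are wired together, so the sketch assumes one small ensemble of one-hidden-layer networks per K-means cluster and a KNN correction computed from training residuals. All function names (`kmeans`, `train_mlp`, `fit_kek`, `predict_kek`) and hyperparameters are hypothetical, and fixed settings stand in for the paper's automatic calibration method, which is not reproduced here.

```python
import numpy as np

def assign(X, centroids):
    """Index of the nearest centroid for every row of X."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's K-means (assumed variant); returns the centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = assign(X, centroids)
        for j in range(k):
            if np.any(labels == j):  # an empty cluster keeps its old centroid
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids

def train_mlp(X, y, hidden=8, epochs=1000, lr=0.05, seed=0):
    """One-hidden-layer tanh network trained by full-batch back-propagation."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)
        err = H @ W2 + b2 - y                   # output error, shape (n,)
        gH = np.outer(err, W2) * (1 - H ** 2)   # error back-propagated to hidden layer
        W2 -= lr * H.T @ err / len(y)
        b2 -= lr * err.mean()
        W1 -= lr * X.T @ gH / len(y)
        b1 -= lr * gH.mean(axis=0)
    return W1, b1, W2, b2

def ensemble_predict(members, X):
    """Average the outputs of all ensemble members."""
    return np.mean([np.tanh(X @ W1 + b1) @ W2 + b2
                    for W1, b1, W2, b2 in members], axis=0)

def fit_kek(X, y, k=2, n_members=3, n_neighbors=3):
    """Cluster the inputs, then fit one ensemble per cluster and store its residuals."""
    centroids = kmeans(X, k)
    centroids = centroids[np.unique(assign(X, centroids))]  # drop empty clusters
    labels = assign(X, centroids)
    clusters = []
    for j in range(len(centroids)):
        Xj, yj = X[labels == j], y[labels == j]
        members = [train_mlp(Xj, yj, seed=s) for s in range(n_members)]
        resid = yj - ensemble_predict(members, Xj)  # training errors used by the KNN step
        clusters.append((members, Xj, resid))
    return {"centroids": centroids, "clusters": clusters, "n_neighbors": n_neighbors}

def predict_kek(model, X):
    """Ensemble estimate plus the mean residual of the nearest training points."""
    labels = assign(X, model["centroids"])
    out = np.empty(len(X))
    for j, (members, Xj, resid) in enumerate(model["clusters"]):
        mask = labels == j
        if not mask.any():
            continue
        d = np.linalg.norm(X[mask][:, None, :] - Xj[None, :, :], axis=2)
        nn = np.argsort(d, axis=1)[:, :model["n_neighbors"]]
        out[mask] = ensemble_predict(members, X[mask]) + resid[nn].mean(axis=1)
    return out
```

Calling `fit_kek(X, y)` on an `(n, d)` input array and an `(n,)` target vector returns a model dictionary, and `predict_kek(model, X_new)` returns the corrected estimates. The ensemble members differ only in their random initialisation here, which is one simple way to build an ensemble and may not match the paper's scheme.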
Acknowledgments
This research was funded by the IWHR Research and Development Support Program (JZ0145B052016), IWHR Scientific Research Projects of Outstanding Young Scientists “Research and application on the fast global optimization method for the Xinanjiang model parameters based on the high performance heterogeneous computing” (No. KY1605), Specific Research of China Institute of Water Resources and Hydropower Research (Grant Nos. Fangji 1240), the Third Sub-Project: Flood Forecasting, Controlling and Flood Prevention Aided Software Development: Flood Control Early Warning Communication System and Flood Forecasting, Controlling and Flood Prevention Aided Software Development for Poyang Lake Area of Jiangxi Province (0628-136006104242, JZ0205A432013, SLXMB200902), Study of distributed flood risk forecast model and technology based on multi-source data integration and hydrometeorological coupling system (2013CB036400), IWHR application project of multi-source precipitation fusion and soil moisture remote sensing assimilation, the NNSF of China, Numerical Simulation Technology of Flash Flood based on Godunov Scheme and Its Mechanism Study by Experiment (No. 51509263), and the China Postdoctoral Science Foundation (Grant No. 2016M591214). We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Tesla K40 GPU used for this research.
Cite this article
Kan, G., He, X., Li, J. et al. A novel hybrid data-driven model for multi-input single-output system simulation. Neural Comput & Applic 29, 577–593 (2018). https://doi.org/10.1007/s00521-016-2534-y
DOI: https://doi.org/10.1007/s00521-016-2534-y