A hybrid-forecasting model based on Gaussian support vector machine and chaotic particle swarm optimization

https://doi.org/10.1016/j.eswa.2009.07.057

Abstract

Load forecasting is an important subject for power distribution systems and has been studied from different points of view. This paper focuses on the Gaussian noise component of load series, which the standard v-support vector regression machine with the ε-insensitive loss function cannot handle effectively. The relation between Gaussian noise and the loss function is established. On this basis, a new v-support vector machine (v-SVM) with a Gaussian loss function, named g-SVM, is proposed. To seek the optimal unknown parameters of g-SVM, a chaotic particle swarm optimization is also proposed. A hybrid load-forecasting model based on g-SVM and embedded chaotic particle swarm optimization (ECPSO) is then put forward. The results of a load forecasting application indicate that the hybrid model is effective and feasible.

Introduction

Precise short-term load forecasting (STLF) is a basic requirement for the power system. As a very important task in power system operation, STLF helps the electric utility make important decisions, including unit commitment, load switching, etc. In addition, precise load forecasting improves the security of the power system. Research approaches to short-term load forecasting can mainly be divided into two categories: statistical methods and artificial intelligence methods. In a statistical method, an equation describing the relationship between load and its related factors is obtained by training on historical data, while an artificial intelligence method tries to imitate the human way of thinking and reasoning in forecasting the future load.

The statistical category includes multiple linear regression (Amjady, 2001, Papalexopoulos and Hesterberg, 1990), stochastic time series (Christianse, 1971), general exponential smoothing, state space, etc. Statistical methods can usually predict linear load series very well, but they lack the ability to analyze the nonlinear character of load series because of their inflexible structure. Expert systems (Dash, Liew, Rahman, & Ramakrishna, 1995), artificial neural networks (ANN) (Chiu et al., 1997, Xiao et al., 2009) and fuzzy inference (Ying & Pan, 2008) belong to the artificial intelligence category. An expert system tries to capture the knowledge of experienced operators and express it as “if … then” rules, but the difficulty is that expert knowledge is sometimes intuitive and cannot easily be expressed. An artificial neural network does not need human experience to be expressed explicitly. It aims to establish a network between the input data set and the observed output data set. It is good at dealing with the nonlinear relationship between the load and its related factors, but its shortcomings are over-fitting and long training time. Fuzzy inference is an extension of the expert system. It constructs an optimal structure of the simplified fuzzy inference, which minimizes model errors and the number of membership functions, to capture the nonlinear behavior of short-term loads. However, it still needs the experts’ experience to generate the fuzzy rules. Generally, artificial intelligence methods are flexible in finding the relationship between load and its related factors, especially for anomalous load forecasting.

Most STLF methods hypothesize a regression function (or a network structure, e.g. in ANN) to represent the relationship between the input and the output variables. Choosing the function or the network is a major difficulty because it requires detailed prior knowledge of the problem. If the regression form or the network structure is improperly selected, the prediction result will be unsatisfactory. Moreover, selecting the input variables is always difficult: too many or too few input variables decrease the accuracy of prediction. Expert systems and fuzzy inference do not need to hypothesize the input–output relationship, but it is even more difficult to transform the experts’ experience into a rule database. Unlike the statistical models, the NN is a data-driven and nonparametric weak model. Thus, the NN performs well in load forecasting when the sample data are sufficient. Nevertheless, the historical load series available in companies are often limited. Under this condition, the approximation ability and generalization performance of the NN are poor. To overcome this disadvantage, a new approach should be explored.

Recently, the support vector machine (SVM) (Vapnik, 1995), a very promising statistical-learning method, has also been applied to STLF and has shown good results. SVM is firmly grounded in the framework of statistical-learning theory and Vapnik–Chervonenkis (VC) theory, which has been developed over the last three decades (Vapnik and Chervonenkis, 1974, Vapnik, 1982). Generally speaking, SVM minimizes the structural risk instead of the usual empirical risk by minimizing an upper bound of the generalization error, and it thereby obtains excellent generalization performance. Moreover, SVM is especially suitable for problems with small sample sizes and has already been used for classification (Akay, 2009, Chandaka et al., 2009, Lee and Lee, 2006, Wu et al., 2009), regression and time series prediction (Tang et al., 2009, Wu, 2009, Wu, in press, Wu et al., 2008, Wu et al., 2009). SVM maps the input data into a higher dimensional feature space through a nonlinear mapping, and a linear regression problem is then obtained and solved in this feature space (Hu and Song, 2004, Ikeda and Aoishi, 2005, Xiao et al., 2008, Yao and Yu, 2006).

However, the standard SVM encounters some difficulties in real applications. Some improved SVMs have been put forward to solve concrete problems (Wu, 2009, Wu, in press, Wu and Yan, 2009a, Wu and Yan, 2009b, Wu et al., 2008, Wu et al., 2009, Wu et al., 2009). The standard v-SVM adopting the ε-insensitive loss function has good generalization capability in some applications (Wu, 2009, Wu, in press, Wu et al., 2008). However, it has difficulty dealing with the normally distributed noise component of a series. The main contributions of this paper can therefore be summarized as follows:

  • (a) A new version of SVM called SVM with Gaussian loss function (g-SVM) is proposed to approximate load series with normal distribution noise. Compared with standard SVM, the proposed SVM can penalize the Gaussian noise parts of load series effectively.

  • (b) A new version of PSO called embedded chaotic particle swarm optimization (ECPSO) is also proposed for parameters selection of g-SVM. ECPSO can augment diversity of particles by means of chaotic mapping and enhance the searching ergodicity.

  • (c) A new hybrid-forecasting method composed of g-SVM and ECPSO is proposed for the STLF. The hybrid-forecasting method can find better solutions in the solution space of the training phase than standard SVM and ARMA.

This paper is organized as follows. The g-SVM is described in Section 2. Section 3 provides a new PSO, called embedded chaotic PSO (ECPSO), to obtain the optimal parameters of g-SVM, and then gives the hybrid model based on g-SVM and ECPSO. In Section 4, g-SVM is used to learn the relationships among all influencing factors and loads. The suitability of the proposed approach is illustrated through an application to real load forecasting from the Jiangsu Electricity Distribution Corporation in China, and g-SVM is then compared with the standard v-SVM and ARMA. Section 5 draws some conclusions.

Section snippets

Standard v-SVM model

Suppose the training sample set is $T=\{(x_i,y_i)\}_{i=1}^{l}$, where $x_i\in R^d$, $y_i\in R$. The ε-insensitive loss function can be described as follows:
$$c(x_i,y_i,f(x_i))=|y_i-f(x_i)|_{\varepsilon}$$
where $|y_i-f(x_i)|_{\varepsilon}=\max\{0,|y_i-f(x_i)|-\varepsilon\}$ and ε is a given real number.
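The ε-insensitive loss defined above can be illustrated with a minimal Python sketch. The function names are hypothetical. The squared-error form used for comparison is an assumption: it is the loss usually associated with Gaussian noise, and this excerpt does not give the exact form of the g-SVM loss.

```python
def eps_insensitive_loss(y, f_x, eps):
    """Return |y - f(x)|_eps = max(0, |y - f(x)| - eps)."""
    return max(0.0, abs(y - f_x) - eps)


def squared_loss(y, f_x):
    """Return 0.5 * (y - f(x))**2.

    Assumption: shown only for comparison as the loss usually tied to
    Gaussian noise; the excerpt does not state the g-SVM loss itself.
    """
    return 0.5 * (y - f_x) ** 2


if __name__ == "__main__":
    eps = 0.1  # illustrative tube width, not a value from the paper
    for prediction in (0.95, 0.5, 2.0):
        print(prediction,
              eps_insensitive_loss(1.0, prediction, eps),
              squared_loss(1.0, prediction))
```

Residuals smaller than ε cost nothing under the ε-insensitive loss, whereas the squared-error form penalizes every nonzero residual, which is the qualitative difference the paper builds on.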

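The standard v-SVM with ε-insensitive loss function can be described as follows:
$$\min_{w,b,\xi^{(*)},\varepsilon}\ \tau(w,\xi^{(*)},\varepsilon)=\frac{1}{2}\|w\|^{2}+C\cdot\left(v\cdot\varepsilon+\frac{1}{l}\sum_{i=1}^{l}(\xi_i+\xi_i^{*})\right)$$
$$\text{s.t.}\quad (w\cdot x_i+b)-y_i\le\varepsilon+\xi_i,\quad y_i-(w\cdot x_i+b)\le\varepsilon+\xi_i^{*},\quad \xi^{(*)}\ge 0,\ \varepsilon\ge 0$$
where $w$ is a d-dimensional column vector, $C>0$ is a penalty factor, $\xi_i^{(*)}$ $(i=1,\ldots,l)$ are slack variables and v(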
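As a rough illustration of the primal problem above, the sketch below evaluates the v-SVM objective for a given linear model. It assumes the usual ν-SVR grouping in which C multiplies both the v·ε term and the averaged slack term, and it sets each slack variable to the smallest value that satisfies its constraint. The function name and the toy data are hypothetical.

```python
def v_svm_objective(w, b, eps, X, y, C, v):
    """Evaluate 0.5*||w||^2 + C*(v*eps + (1/l)*sum(xi_i + xi_i^*)).

    Assumption: C scales both the v*eps term and the mean slack term,
    as in the standard nu-SVR formulation.
    """
    l = len(X)
    slack_sum = 0.0
    for x_i, y_i in zip(X, y):
        f_x = sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b
        # Smallest slacks satisfying the two inequality constraints.
        xi = max(0.0, f_x - y_i - eps)       # (w.x_i + b) - y_i <= eps + xi_i
        xi_star = max(0.0, y_i - f_x - eps)  # y_i - (w.x_i + b) <= eps + xi_i^*
        slack_sum += xi + xi_star
    regularizer = 0.5 * sum(w_j * w_j for w_j in w)
    return regularizer + C * (v * eps + slack_sum / l)


if __name__ == "__main__":
    X = [[1.0], [2.0]]  # toy inputs, not data from the paper
    y = [1.0, 3.0]
    print(v_svm_objective(w=[1.0], b=0.0, eps=0.5, X=X, y=y, C=2.0, v=0.5))
```

Evaluating the objective this way only scores a candidate (w, b, ε); solving the optimization itself requires a quadratic programming solver, which is outside this sketch.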

Embedded chaotic particle swarm optimization

It is difficult to determine the optimal parameters of the SVM model. The cross-validation method commonly used to determine the penalty coefficient, controlling vector and kernel parameter suffers from cross-validation error. To overcome this shortcoming, a new PSO with chaotic mapping is proposed, namely, embedded chaotic particle swarm optimization (ECPSO). When used to optimize the parameters of g-SVM, ECPSO can increase the diversity of individuals and the ergodicity of the search.
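The general idea of combining PSO with a chaotic map can be sketched as follows. This is not the paper's ECPSO, whose details are not given in this excerpt: it is a plain global-best PSO in which a logistic map (an assumed choice of chaotic mapping) supplies the initial positions and the random coefficients. All names, default values and the `sphere` test objective are hypothetical; in practice the objective would be the validation error of a model trained with the candidate parameters.

```python
def logistic_map(z):
    """One step of the logistic map z -> 4z(1 - z), chaotic on (0, 1)."""
    return 4.0 * z * (1.0 - z)


def chaotic_pso(objective, bounds, n_particles=20, n_iter=100,
                inertia=0.7, c1=1.5, c2=1.5, seed=0.3141):
    """Minimize `objective` over box `bounds` with a chaos-driven PSO sketch."""
    dim = len(bounds)
    z = seed

    def next_chaos():
        nonlocal z
        z = logistic_map(z)
        if z <= 0.0 or z >= 1.0:  # guard against a degenerate orbit
            z = seed
        return z

    # Chaotic initialisation spreads particles over the search box.
    pos = [[lo + next_chaos() * (hi - lo) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = next_chaos(), next_chaos()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep the particle inside the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(p):
    """Hypothetical stand-in objective: sum of squares, minimum at the origin."""
    return sum(c * c for c in p)


if __name__ == "__main__":
    best, best_val = chaotic_pso(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
    print(best, best_val)
```

Because every number comes from the logistic map rather than a pseudo-random generator, the run is fully determined by `seed`, which makes experiments repeatable.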

Application

In developing a hybrid model forecaster based on SVM, the first important step is feature selection (new features are selected from the original inputs) or feature extraction (new features are transformed from the original inputs). In the development of SVM, all available indicators can be used as the inputs, but irrelevant or correlated features could adversely impact the generalization performance because of the curse of dimensionality. Thus, it is critical to perform feature selection or

Conclusions

Accurate load forecasting is crucial for an energy-limited economy such as China. The historical electricity load data of each region in China show a strong growth trend, particularly in the southern region. Although this is a common phenomenon in developing countries, overproduction or underproduction of electricity strongly affects the sustainable development of the economy. This study introduced a novel forecasting technique, g-SVM, to investigate its feasibility in forecasting annual regional

Acknowledgements

This research is supported by the National Natural Science Foundation of China under grants (60904043), China Postdoctoral Science Foundation (20090451152) and Jiangsu Planned Projects for Postdoctoral Research Funds (0901023C).
