Hybrid wavelet ν-support vector machine and chaotic particle swarm optimization for regression estimation
Introduction
Recently, a machine learning technique called the support vector machine (SVM) has drawn much attention in the fields of pattern classification and regression forecasting. SVM, first introduced by Vapnik (1995), is a classifier-learning method grounded in statistical learning theory. The algorithm derives from the linear classifier and originally solved the two-class problem; it was later extended to nonlinear problems, that is, it finds the optimal hyperplane separating the sample set. It is an approximate implementation of the structural risk minimization (SRM) principle of statistical learning theory, rather than the empirical risk minimization (ERM) method.
Compared with traditional neural networks, SVM uses structural risk minimization to avoid problems such as overfitting, the curse of dimensionality, and local minima. The algorithm generalizes well on small sample sets, and SVM has also been used successfully for machine learning with large, high-dimensional data sets. These attractive properties make SVM a promising technique: the generalization property of an SVM depends not on the complete training data but only on a subset of it, the so-called support vectors. SVM has now been applied in many fields, including supply chain demand forecasting (Carbonneau, Laframboise, & Vahidov, 2008), fault diagnosis (Ge et al., 2008; Lu et al., 2005), load forecasting (Hong, 2009), feature selection (Du et al., 2007; Huang & Dun, 2007; Lin et al., 2008; Pfingsten et al., 2007), and gene selection and tumor classification (Shen, Shi, Kong, & Ye, 2007).
For pattern recognition and regression analysis, the nonlinear ability of SVM is achieved through kernel mapping, and the kernel function must satisfy the condition of Mercer's theorem. The Gaussian function is the most commonly used kernel and shows good generalization ability. However, with the kernel functions used so far, the SVM cannot approximate an arbitrary curve in L2(Rn) (the space of square-integrable functions), because these kernels do not form a complete orthonormal basis. For the same reason, the regression SVM cannot approximate every function.
Accordingly, a new kernel function is needed, one that can build a complete basis through translation and dilation. Such functions already exist: the wavelet functions. Based on wavelet theory, this paper proposes an admissible support vector (SV) kernel function, named the wavelet kernel function, and proves that such kernels exist. The Morlet and Mexican-hat wavelet kernel functions form an (approximately) orthonormal basis of L2(Rn). Based on wavelet analysis and the conditions on a support vector kernel, a Morlet or Mexican-hat wavelet kernel for the support vector regression machine is proposed, which is an approximately orthonormal function. This kernel function can approximate any curve in L2(Rn) space, and thus enhances the generalization ability of the SVM.
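As an illustration of such a kernel, here is a minimal sketch of the translation-invariant Morlet wavelet kernel, K(x, x') = prod_i cos(1.75 (x_i - x'_i)/a) exp(-(x_i - x'_i)^2 / (2 a^2)); the dilation value and the implementation details are assumptions for illustration, not the paper's code:

```python
import numpy as np

def morlet_kernel(X, Y, a=1.0):
    """Translation-invariant Morlet wavelet kernel.

    K(x, y) = prod_i cos(1.75 * (x_i - y_i) / a) * exp(-((x_i - y_i) / a)**2 / 2)
    Returns the Gram matrix between row-sample arrays X (m, d) and Y (n, d).
    """
    X = np.atleast_2d(X)
    Y = np.atleast_2d(Y)
    # Pairwise differences, shape (m, n, d), scaled by the dilation a
    diff = (X[:, None, :] - Y[None, :, :]) / a
    # Product over the feature dimension gives the multidimensional kernel
    return np.prod(np.cos(1.75 * diff) * np.exp(-diff**2 / 2.0), axis=2)
```

Because the kernel is translation-invariant, K(x, x) = 1 for every x, and the Gram matrix is symmetric for X = Y.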
Research on wavelet support vector machines falls into two categories. In the first, the wavelet transform decomposes a time series into high- and low-frequency components, or into components at different scales; an SVM then estimates each component separately, and the final estimate is obtained by reconstructing the component estimates with the inverse wavelet transform. In this method the wavelet serves only as a data-preprocessing tool (Fernandez, 2007; Neumann et al., 2005). In the second, a wavelet function is used as the SV kernel via dot products or translation, which integrates the characteristics of wavelets and SVM more closely (Chen et al., 2006; Chen & Dudek, 2005; Lu et al., 2005; Wu et al., 2005). For classification, many studies address the wavelet ε-support vector machine (W ε-SVM) (Wu et al., 2005; Lu et al., 2005). For regression, the published literature focuses almost exclusively on W ε-SVM (Chen & Dudek, 2005; Chen et al., 2006), whereas the wavelet ν-support vector machine (W ν-SVM) has received no attention.
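The first, preprocessing-style use of wavelets can be sketched with a one-level Haar transform (a simplifying assumption; the cited works may use other wavelets and deeper decompositions): the series is split into low- and high-frequency components, each component is estimated separately, and the estimates are recombined by the inverse transform.

```python
import numpy as np

def haar_dwt(x):
    """One-level Haar transform: return (approximation, detail) coefficients.

    Assumes len(x) is even; uses the orthonormal scaling 1/sqrt(2).
    """
    x = np.asarray(x, dtype=float)
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)   # low-frequency component
    detail = (even - odd) / np.sqrt(2.0)   # high-frequency component
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse one-level Haar transform (exact reconstruction)."""
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x
```

In the type-one scheme, one regression model would be trained per component before this reconstruction step.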
Based on the analysis of the above WSVM literature, the contribution of this paper is the establishment and application of W ν-SVM for regression estimation.
However, some unknown parameters of the SVM model, such as the parameter ν, the penalty coefficient C, and the kernel coefficient, must be determined before the final regression estimation. To seek the optimal combination of these parameters for W ν-SVM regression estimation, an embedded chaotic particle swarm optimization (ECPSO) algorithm is also proposed.
Based on W ν-SVM and ECPSO, this paper proposes a hybrid model for estimating series with multi-dimensional, nonlinear, and uncertain characteristics. W ν-SVM is presented in Section 2. Section 3 gives the ECPSO algorithm. The hybrid model based on W ν-SVM and ECPSO is proposed in Section 4. Section 5 presents an application of the hybrid model. Section 6 draws the conclusion.
W ν-SVM
SVM is one of the most important achievements of statistical learning theory. WSVM is a kind of SVM whose key idea is to first map the samples into a high-dimensional feature space with a wavelet kernel function, then find the support vectors and their corresponding coefficients by solving a quadratic programming problem based on the structural risk minimization rule and the dual principle, and finally use these support vectors and coefficients to construct the classifier.
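The last step, building the regressor from the support vectors and their coefficients, takes the dual form f(x) = sum_i (alpha_i - alpha_i*) K(s_i, x) + b. A minimal sketch with hypothetical support vectors and coefficients (in W ν-SVM these would come from the quadratic programming solution, not from hand-picked values as here):

```python
import numpy as np

def wavelet_kernel(x, y, a=1.0):
    """Morlet-type wavelet kernel between two vectors."""
    u = (np.asarray(x, dtype=float) - np.asarray(y, dtype=float)) / a
    return float(np.prod(np.cos(1.75 * u) * np.exp(-u**2 / 2.0)))

def svr_decision(x, support_vectors, dual_coefs, b, a=1.0):
    """Dual-form regressor: f(x) = sum_i coef_i * K(s_i, x) + b.

    dual_coefs[i] stands for (alpha_i - alpha_i*) of support vector i.
    """
    return sum(c * wavelet_kernel(s, x, a)
               for s, c in zip(support_vectors, dual_coefs)) + b

# Hypothetical support vectors and coefficients, for illustration only
S = [np.array([0.0, 0.0]), np.array([1.0, 1.0])]
coefs = [0.5, -0.3]
b = 0.1
```

Evaluating `svr_decision` at a new point then reduces to a weighted sum of kernel evaluations against the stored support vectors.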
Optimization algorithm of parameters selection
Determining the unknown parameters of W ν-SVM is a complicated process; in fact, it is a multivariable optimization problem in a continuous space. These parameters have a great effect on the generalization performance of W ν-SVM, and an appropriate parameter combination can enhance the approximation accuracy of the proposed model. It is therefore necessary to select an intelligent algorithm to obtain the optimal parameters.
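The chaos-enhanced PSO idea can be sketched as follows; the logistic map, inertia weight, and acceleration constants here are common textbook choices, not necessarily the authors' ECPSO settings (which are given in Section 3):

```python
import numpy as np

rng = np.random.default_rng(0)

def logistic_map(z, n):
    """Generate n chaotic values in (0, 1) with the logistic map z <- 4z(1 - z)."""
    out = np.empty(n)
    for i in range(n):
        z = 4.0 * z * (1.0 - z)
        out[i] = z
    return out

def chaotic_pso(fitness, dim, lo, hi, n_particles=20, n_iter=50,
                w=0.7, c1=1.5, c2=1.5):
    """Minimize `fitness` over the box [lo, hi]^dim.

    Positions are initialized from a chaotic sequence instead of a uniform draw,
    which spreads particles ergodically over the search box.
    """
    chaos = logistic_map(0.345, n_particles * dim).reshape(n_particles, dim)
    pos = lo + chaos * (hi - lo)
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), np.array([fitness(p) for p in pos])
    g = pbest[np.argmin(pbest_val)].copy()          # global best position
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([fitness(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        g = pbest[np.argmin(pbest_val)].copy()
    return g, float(pbest_val.min())
```

For parameter selection, `fitness` would be the cross-validation error of W ν-SVM as a function of a parameter vector such as (C, ν, a).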
The hybrid model
In series estimation, two of the key problems are model establishment and model parameter selection. A potential solution to both problems is the hybrid model (HM) architecture illustrated in Fig. 1. The HM is a two-stage architecture for handling non-stationarity in the data; in the first stage, a mixture of experts including an evolutionary algorithm competes to optimize the model.
Experiment
To analyze the performance of the proposed model, the estimation of a car sale series is studied. To examine the performance of ECPSO, standard PSO and adaptive PSO (APSO), which uses a linear inertia weight, are also adopted to optimize the parameters of W ν-SVM. To evaluate the estimating capacity of the HM, evaluation indexes such as the mean absolute error (MAE), mean absolute percentage error (MAPE), and mean square error (MSE) are applied to the forecasting results.
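The three evaluation indexes can be computed directly; this is the standard formulation (MAPE in percent, assuming no zero targets), which may differ in detail from the paper's:

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent; assumes no zero targets."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

def mse(y_true, y_pred):
    """Mean square error."""
    return float(np.mean((y_true - y_pred) ** 2))
```

Lower values on all three indexes indicate a closer fit between the forecast and the actual series.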
Conclusion
In this paper, a new version of SVM, named W ν-SVM, is proposed for regression estimation. To seek the optimal parameters of W ν-SVM, a new PSO variant is also proposed. The performance of the hybrid model based on ECPSO and W ν-SVM is evaluated by estimating car sales, and the simulation results demonstrate that the hybrid model is effective in dealing with multi-dimensionality, nonlinearity, and finite samples.
Acknowledgements
This research is supported by the National Natural Science Foundation of China under grant 60904043; a research grant funded by the Hong Kong Polytechnic University (G-YX5J); funding from the China Postdoctoral Science Foundation (20090451152); the third special grant from the China Postdoctoral Science Foundation, Jiangsu Planned Projects for Postdoctoral Research Funds (0901023C); and by the Southeast University Planned Projects for Postdoctoral Research Fund.
References (28)
- Particle swarm approaches using Lozi map chaotic sequences to modelling of an experimental thermal-vacuum system. Applied Soft Computing, 2008.
- Application of machine learning techniques for supply chain demand forecasting. European Journal of Operational Research, 2008.
- Wavelet- and SVM-based forecasts: An analysis of the US metal and materials manufacturing industry. Resources Policy, 2007.
- New chaotic algorithm for image encryption. Chaos, Solitons and Fractals, 2006.
- Electric load forecasting by support vector model. Applied Mathematical Modelling, 2009.
- Particle swarm optimization for parameter determination and feature selection of support vector machines. Expert Systems with Applications, 2008.
- Improved particle swarm optimization combined with chaos. Chaos, Solitons and Fractals, 2005.
- Efficient wavelet adaptation for hybrid wavelet–large margin classifiers. Pattern Recognition, 2005.
- A combination of modified particle swarm optimization algorithm and support vector machine for gene selection and tumor classification. Talanta, 2007.
- The connection between regularization operators and support vector kernels. Neural Networks, 1998.