
Novel hybrid SVM-TLBO forecasting model incorporating dimensionality reduction techniques


Abstract

In this paper, we present a highly accurate forecasting method that supports improved investment decisions. The proposed method extends the novel hybrid SVM-TLBO model, which combines a support vector machine (SVM) with a teaching-learning-based optimization (TLBO) method that determines the optimal SVM parameters, by coupling it with dimensionality reduction techniques (DR-SVM-TLBO). The dimensionality reduction techniques (feature extraction approaches) extract critical, non-collinear, relevant, and de-noised information from the input variables (features) and reduce the time complexity. We investigated three feature extraction techniques: principal component analysis (PCA), kernel principal component analysis (KPCA), and independent component analysis (ICA). The feasibility and effectiveness of the proposed ensemble model were examined in a case study: predicting the daily closing prices of the COMDEX commodity futures index traded on the Multi Commodity Exchange of India Limited. We assessed the performance of the new ensemble model with the three feature extraction techniques using different performance metrics and statistical measures, and compared our results with those of a standard SVM model and an SVM-TLBO hybrid model. The experimental results show that the new ensemble model is viable and effective and provides better predictions. The proposed model can provide technical support for better financial investment decisions and can be used as an alternative for forecasting tasks that require more accurate predictions.



References

  1. Cai LJ, Zhang JQ, Cai Z, Lim KG (2006) An empirical study of dimensionality reduction in support vector machine. Neural Network World 16(3):177–192

  2. Cao LJ (2003) Support vector machines experts for time series forecasting. Neurocomputing 51:321–339

  3. Cao LJ, Chua KS, Chong WK, Lee HP, Gu QM (2003) A comparison of PCA, KPCA and ICA for dimensional reduction in support vector machines. Neurocomputing 55(1):321–336

  4. Cao LJ, Tay FEH (2003) Support vector machine with adaptive parameters in financial time series forecasting. IEEE Trans Neural Netw 14(6):1506–1518

  5. Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol (TIST) 2(3):27

  6. Chang PC, Wu JL (2015) A critical feature extraction by kernel PCA in stock trading model. Soft Comput 19(5):1393–1408

  7. Chen WH, Shih JY, Wu S (2006) Comparison of support-vector machines and back propagation neural networks in forecasting the six major Asian stock markets. Int J Electron Finance 1(1):49–67

  8. Das SP, Padhy S (2015) A novel hybrid model using teaching–learning-based optimization and a support vector machine for commodity futures index forecasting. Int J Mach Learn Cybern:1–15. doi:10.1007/s13042-015-0359-0

  9. Diebold FX, Mariano RS (1995) Comparing predictive accuracy. J Bus Econ Stat 13(3):253–263

  10. Ekenel HK, Sankur B (2004) Feature selection in the independent component subspace for face recognition. Pattern Recogn Lett 25(12):1377–1388

  11. Haykin S (2010) Neural networks and learning machines, 3rd edn. PHI Learning Private Limited

  12. Hsu CM (2013) A hybrid procedure with feature selection for resolving stock/futures price forecasting problems. Neural Comput Applic 22(3–4):651–671. doi:10.1007/s00521-011-0721-4

  13. Huang CL, Tsai CY (2009) A hybrid SOFM-SVR with a filter-based feature selection for stock market forecasting. Expert Syst Appl 36(2):1529–1539. doi:10.1016/j.eswa.2007.11.062

  14. Hyvärinen A, Karhunen J, Oja E (2001) Independent component analysis. Wiley, New York

  15. Hyvärinen A, Oja E (1997) A fast fixed-point algorithm for independent component analysis. Neural Comput 9(7):1483–1492

  16. Hyvärinen A, Oja E (2000) Independent component analysis: algorithms and applications. Neural Netw 13(4):411–430

  17. Ince H, Trafalis TB (2007) Kernel principal component analysis and support vector machines for stock price prediction. IIE Trans 39(6):629–637

  18. Ince H, Trafalis TB (2008) Short term forecasting with support vector machines and application to stock price prediction. Int J Gen Syst 37(6):677–687. doi:10.1080/03081070601068595

  19. Jiang M, Jiang S, Zhu L, Wang Y, Huang W, Zhang H (2013) Study on parameter optimization for support vector regression in solving the inverse ECG problem. Comput Math Methods Med, Article ID 158056. doi:10.1155/2013/158056

  20. Jolliffe IT (2002) Principal component analysis, 2nd edn. Springer, New York

  21. Kim KJ (2003) Financial time series forecasting using support vector machines. Neurocomputing 55(1):307–319

  22. Kim KJ, Han I (2000) Genetic algorithms approach to feature discretization in artificial neural networks for the prediction of stock price index. Expert Syst Appl 19(2):125–132

  23. Kim KJ, Lee WB (2004) Stock market prediction using artificial neural networks with optimal feature transformation. Neural Comput Applic 13(3):255–260. doi:10.1007/s00521-004-0428-x

  24. Kuang F, Zhang S, Jin Z, Xu W (2015) A novel SVM by combining kernel principal component analysis and improved chaotic particle swarm optimization for intrusion detection. Soft Comput 19:1187–1199. doi:10.1007/s00500-014-1332-7

  25. Lai RK, Fan CY, Huang WH, Chang PC (2009) Evolving and clustering fuzzy decision tree for financial time series data forecasting. Expert Syst Appl 36(2):3761–3773. doi:10.1016/j.eswa.2008.02.025

  26. Leung MT, Daouk H, Chen AS (2000) Forecasting stock indices: a comparison of classification and level estimation models. Int J Forecast 16(2):173–190

  27. Liang X, Zhang H, Xiao J, Chen Y (2009) Improving option price forecasts with neural networks and support vector regressions. Neurocomputing 72(13):3055–3065. doi:10.1016/j.neucom.2009.03.015

  28. Lin HT, Lin CJ (2003) A study on sigmoid kernels for SVM and the training of non-PSD kernels by SMO-type methods. Technical report, Department of Computer Science and Information Engineering, National Taiwan University, pp 1–32

  29. Lin SW, Ying KC, Chen SC, Lee ZJ (2008) Particle swarm optimization for parameter determination and feature selection of support vector machines. Expert Syst Appl 35(4):1817–1824

  30. Liu S, Tian L, Huang Y (2014) A comparative study on prediction of throughput in coal ports among three models. Int J Mach Learn Cybern 5(1):125–133. doi:10.1007/s13042-013-0201-5

  31. Lu CJ (2013) Hybridizing nonlinear independent component analysis and support vector regression with particle swarm optimization for stock index forecasting. Neural Comput Applic 23(7–8):2417–2427. doi:10.1007/s00521-012-1198-5

  32. Lu CJ, Lee TS, Chiu CC (2009) Financial time series forecasting using independent component analysis and support vector regression. Decis Support Syst 47(2):115–125

  33. Musa AB (2014) A comparison of ℓ1-regularization, PCA, KPCA and ICA for dimensionality reduction in logistic regression. Int J Mach Learn Cybern 5(6):861–873. doi:10.1007/s13042-013-0171-7

  34. Pawar PV, Rao RV (2013) Parameter optimization of machining using teaching-learning-based optimization algorithm. Int J Adv Manuf Technol 67:995–1006

  35. Porikli F, Haga T (2004) Event detection by eigenvector decomposition using object and frame features. In: IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'04), pp 114–114

  36. Rao RV, Patel V (2014) A multi-objective improved teaching-learning based optimization algorithm for unconstrained and constrained optimization problems. Int J Ind Eng Comput 5(1):1–22. doi:10.5267/j.ijiec.2013.09.007

  37. Rao RV, Savsani VJ, Vakharia DP (2011) Teaching-learning-based optimization: a novel method for constrained mechanical design optimization problems. Comput Aided Des 43(3):303–315

  38. Sapankevych NI, Sankar R (2009) Time series prediction using support vector machines: a survey. IEEE Comput Intell Mag 4(2):24–38. doi:10.1109/MCI.2009.932254

  39. Tay FEH, Cao LJ (2002) Modified support vector machines in financial time series forecasting. Neurocomputing 48(1):847–861

  40. Tsai CF, Hsiao YC (2010) Combining multiple feature selection methods for stock prediction: union, intersection, and multi-intersection approaches. Decis Support Syst 50(1):258–269

  41. Tsang PM, Kwok P, Choy SO, Kwan R, Ng SC, Mak J, Tsang J, Koong K, Wong TL (2007) Design and implementation of NN5 for Hong Kong stock price forecasting. Eng Appl Artif Intell 20(4):453–461. doi:10.1016/j.engappai.2006.10.002

  42. Twining CJ, Taylor CJ (2003) The use of kernel principal component analysis to model data distributions. Pattern Recogn 36(1):217–227

  43. Van Gestel T, Suykens JA, Baestaens DE, Lambrechts A, Lanckriet G, Vandaele B, Vandewalle J (2001) Financial time series prediction using least squares support vector machines within the evidence framework. IEEE Trans Neural Netw 12(4):809–821

  44. Vapnik V (1995) The nature of statistical learning theory. Springer, New York

  45. Wang J, Wang J (2015) Forecasting stock market indexes using principal component analysis and stochastic time effective neural networks. Neurocomputing 156:68–78

  46. Wang S, Meng B (2011) Parameter selection algorithm for support vector machine. Procedia Environ Sci 11:538–544. doi:10.1016/j.proenv.2011.12.085

  47. Wu CH, Tzeng GH, Lin RH (2009) A novel hybrid genetic algorithm for kernel function and parameter optimization in support vector regression. Expert Syst Appl 36(3):4725–4735. doi:10.1016/j.eswa.2008.06.046

  48. Zhai G, Chen J, Wang S, Li K, Zhang L (2015) Material identification of loose particles in sealed electronics devices using PCA and SVM. Neurocomputing 148:222–228. doi:10.1016/j.neucom.2013.10.043


Acknowledgments

We would like to express our gratitude to the National Institute of Science and Technology (NIST) for the facilities and resources provided at the Data Science Laboratory at NIST for the development of this study.

Author information


Corresponding author

Correspondence to Shom Prasad Das.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest (financial or non-financial) regarding the publication of this paper.

Appendices

Appendix A: Technical indicators (features) used in this study


Appendix B: Dimensionality reduction techniques used in this study

The objective of a dimensionality reduction technique is to reduce the dimension (number of features) of the input from a high-dimensional space to a low-dimensional subspace. Dimensionality reduction methods can be divided into two types: (i) feature selection and (ii) feature extraction. In feature selection, a subset of the original features is selected. In feature extraction, new features are computed by transforming the original features. We present brief reviews of the dimensionality reduction methods based on feature extraction that were used in our study: PCA, KPCA, and ICA.

B.1 Principal component analysis (PCA)

Principal component analysis is a well-known linear statistical approach to feature extraction. The objective is to reduce the dimension of the input features of the original dataset [20]. It uses an orthogonal transformation to convert a set of $N$ patterns (samples) of $l$ possibly correlated features into a set of $N$ samples of $m$ ($m \le l$) uncorrelated features called principal components (PCs). The transformation is designed such that the first principal component (PC1) has the highest possible variance, the second principal component (PC2) is orthogonal to PC1 and accounts for the next highest variance, and so on for the remaining PCs.

The PCA procedure is briefly described as follows.

Step 1:

Input $N$ patterns (samples) $X_1, X_2, \ldots, X_N$, each with $l$ features ($X_j \in R^l$). Each vector $X_j$, $j = 1, 2, \ldots, N$, is assumed to be mean-centred (that is, we subtract the mean value of each original feature from the corresponding feature values).

Step 2:

Compute the covariance matrix

$$ C=\frac{1}{N}\sum\limits_{k=1}^{N} {X_{k} {X_{k}^{T}}} $$
(B.1)

The $ij$-th element of the matrix $C$ is

$$ C_{ij} =\frac{1}{N}\sum\limits_{k=1}^{N} {X_{k} (i)X_{k} (j)} $$
(B.2)

where $X_k(i)$ denotes the $i$th component of the sample $X_k$.

Step 3:

Calculate the $l$ eigenvalues of $C$ and arrange them in non-increasing order $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_l$. For each eigenvalue $\lambda_i$, $i = 1, 2, \ldots, l$, compute an associated eigenvector $\alpha_i \in R^l$ of the matrix $C$ using an eigenvector decomposition technique [35].

Step 4:

Choose the $m \le l$ largest eigenvalues (choose the smallest integer $m$ such that $\lambda_{m-1} - \lambda_m$ is large, or such that $\sum\limits_{i=1}^{m} \lambda_i \ge t \sum\limits_{i=1}^{l} \lambda_i$, where $t = 0.95$ if we wish to retain 95% of the variance in the transformed data and $\sum\limits_{i=1}^{l} \lambda_i$ represents the total variance).

Step 5:

Use the eigenvectors (column vectors) $\alpha_1, \alpha_2, \ldots, \alpha_m$ to form the transformation matrix

$$ A=[\alpha_{1} \alpha_{2} ...\alpha_{m}] $$
(B.3)

Step 6:

Transform each pattern $X_i$ in the original space $R^l$ to the vector $Y_i$ in the $m$-dimensional space $R^m$ ($m < l$) using

$$ Y_{i} =A^{T}X_{i} ,\quad i=1,2,\ldots,N $$
(B.4)

So the $j$th component $Y_i(j)$ of $Y_i$ is the projection of $X_i$ on $\alpha_j$ (i.e., $Y_i(j) = \alpha_j^T X_i$).
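For illustration, a minimal NumPy sketch of Steps 1–6, assuming the samples are stacked as an $N \times l$ array and that the variance-retention rule of Step 4 is applied with $t = 0.95$; the function name pca_transform and the random demonstration data are illustrative, not taken from the paper.

```python
import numpy as np

def pca_transform(X, t=0.95):
    """Project N x l samples onto the m leading principal components.

    Follows Appendix B.1: centre the data, eigendecompose the covariance
    matrix, keep the smallest m whose eigenvalues retain a fraction t of
    the total variance, and project each sample (Y_i = A^T X_i).
    """
    Xc = X - X.mean(axis=0)                      # Step 1: zero-mean features
    C = (Xc.T @ Xc) / Xc.shape[0]                # Step 2: covariance matrix (B.1)
    eigvals, eigvecs = np.linalg.eigh(C)         # Step 3: eigen-decomposition
    order = np.argsort(eigvals)[::-1]            # non-increasing eigenvalues
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()   # Step 4: retained-variance rule
    m = int(np.searchsorted(ratio, t) + 1)
    A = eigvecs[:, :m]                           # Step 5: transformation matrix (B.3)
    return Xc @ A                                # Step 6: Y_i = A^T X_i (B.4)

# Illustrative usage with random data standing in for the technical indicators
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
Y = pca_transform(X, t=0.95)
print(Y.shape)   # (200, m) with m <= 10
```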

B.2 Kernel principal component analysis (KPCA)

In the PCA technique, each input pattern (sample) in $R^l$ is linearly projected onto a lower-dimensional subspace. This is appropriate when the data approximately lie on a linear manifold (for example, a hyperplane). However, in many applications the input data lie on a low-dimensional nonlinear manifold, and it is then more appropriate to use KPCA, which is a nonlinear dimensionality reduction technique. In this method, the input patterns $X_i \in R^l$ for $i = 1, 2, \ldots, N$ (where $N$ is the number of input samples) are first mapped onto a space $H$ with more than $l$ dimensions using a nonlinear mapping $\phi: R^l \to H$ [42]. Their images $\phi(X_i)$ are projected along the orthonormal eigenvectors of the covariance matrix of the $\phi(X_i)$'s. These projections involve only inner products of the $\phi(X_i)$'s in $H$; since $\phi$ is not explicitly known, these inner products are difficult to compute directly. So we use a kernel function $K: R^l \times R^l \to R$, defined such that

$$ K(X_{i} ,X_{j} )=<\phi (X_{i} ),\phi (X_{j})> $$
(B.5)

(where $\langle \cdot ,\cdot \rangle$ denotes the inner product in $H$), to compute the inner products involved in the projections, leading to the computation of the $Y_i$'s, which have fewer dimensions $m$ ($m < l$) than the $X_i$'s. It has been proved that the components $Y_i(k)$, $k = 1, 2, \ldots, m$, of the $Y_i$'s are uncorrelated and that the first $q$ ($q \le m$) principal components have maximum mutual information with respect to the inputs, which justifies the use of the method for dimensionality reduction.

The KPCA procedure is given in the form of the following algorithm.

Step 1:

Input the data patterns (samples) $X_i \in R^l$ for $i = 1, 2, \ldots, N$ (where $N$ is the number of input samples).

Step 2:

Choose a kernel function $K: R^l \times R^l \to R$ and compute the kernel matrix $K_1$ whose $ij$-th element is equal to $K(X_i, X_j)$ for $i, j = 1, 2, \ldots, N$.

Step 3:

Compute the eigenvalues and eigenvectors of $K_1$. Arrange the eigenvalues in non-increasing order $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_N$, and let the corresponding eigenvectors be $a_1, a_2, \ldots, a_N$.

Step 4:

Choose the $m$ dominant eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_m$ ($m \le l$) [choose the smallest integer $m$ such that $\lambda_{m-1} - \lambda_m$ is large, or such that $\sum\limits_{i=1}^{m} \lambda_i \ge t \sum\limits_{i=1}^{N} \lambda_i$, where $t = 0.95$ if we wish to retain 95% of the variance in the transformed data and $\sum\limits_{i=1}^{N} \lambda_i$ represents the total variance], and normalize the corresponding eigenvectors $a_1, a_2, \ldots, a_m$ using

$$ a_{k}^{\prime} =\frac{a_{k} }{\left\| {a_{k} } \right\|\sqrt {\lambda_{k} } },\quad k=1,2,\ldots,m $$
(B.6)

Step 5:

For each $X_i$, $i = 1, 2, \ldots, N$, compute the $m$ projections $Y_i(k)$ of $\phi(X_i)$ onto each of the orthonormal eigenvectors $a_{k}^{\prime}$, $k = 1, 2, \ldots, m$, i.e.,

$$ Y_{i} (k)=\sum\limits_{j=1}^{N} {a_{k}^{\prime} (j)K(X_{i} ,X_{j} )} ,\quad k=1,2,\ldots,m $$
(B.7)
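A compact NumPy sketch of Steps 1–5 above, assuming a Gaussian (RBF) kernel with an illustrative gamma value; note that the procedure as listed (and therefore this sketch) does not centre the kernel matrix, a step that standard KPCA implementations such as scikit-learn's KernelPCA add. The names rbf_kernel and kpca_transform are illustrative.

```python
import numpy as np

def rbf_kernel(X, gamma=0.1):
    # Kernel matrix K1 with K1[i, j] = exp(-gamma * ||X_i - X_j||^2)   (Step 2)
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * d2)

def kpca_transform(X, m, gamma=0.1):
    """Project N x l samples onto the m leading kernel principal components."""
    K1 = rbf_kernel(X, gamma)                      # Step 2: N x N kernel matrix
    eigvals, eigvecs = np.linalg.eigh(K1)          # Step 3: eigen-decomposition
    order = np.argsort(eigvals)[::-1]              # non-increasing eigenvalues
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step 4: eigh returns unit-norm eigenvectors, so (B.6) reduces to a_k / sqrt(lambda_k)
    a_prime = eigvecs[:, :m] / np.sqrt(eigvals[:m])
    # Step 5: Y_i(k) = sum_j a'_k(j) K(X_i, X_j)    (B.7)
    return K1 @ a_prime

# Illustrative usage
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 10))
Y = kpca_transform(X, m=5)
print(Y.shape)   # (150, 5)
```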

B.3 Independent component analysis (ICA)

Independent component analysis (ICA) is a relatively recent statistical method [14, 16]. Unlike PCA, which only produces uncorrelated components, ICA attempts to find components or factors in the transformed vectors that are statistically independent. The primary goal of the method is to find a representation of non-Gaussian data in which the components are statistically independent, or as independent as possible [16].

In ICA, we assume that the $l$ measured variables $X = [x_1, x_2, \ldots, x_l]^T$ can be expressed as linear combinations of $n$ unknown latent source components $S = [s_1, s_2, \ldots, s_n]^T$, i.e.,

$$ X=AS $$
(B.8)

where $A_{l \times l}$ is an unknown mixing matrix. Here, we consider that $l = n$, so that $A$ is a full-rank square matrix. $S$ is the latent source data that cannot be directly observed from the input mixture data $X$. The basic ICA objective is to estimate the latent source components $S$ and the unknown mixing matrix $A$ from $X$, with appropriate assumptions on the statistical properties of the source distribution. The basic ICA model for feature transformation aims to find a de-mixing matrix $W_{l \times l}$ such that

$$ Y=WX $$
(B.9)

where $Y = [y_1, y_2, \ldots, y_n]^T$ is the independent component vector. The elements of $Y$ must be statistically independent and are called independent components (ICs). Here, $W = A^{-1}$ (i.e., the de-mixing matrix $W$ is the inverse of the mixing matrix $A$). The ICs ($y_i$) can be used to estimate the latent source signals $s_i$.

Many algorithms can perform ICA. The fixed-point FastICA method presented by Hyvärinen and Oja [15] is the most popular, and we used it in our experimental study. In this algorithm, PCA is first used to transform the original input vectors ($X$) into a set of new uncorrelated vectors with zero mean and unit variance (whitening). This step reduces the dimension of $X$ and consequently the number of independent components in $Y$. The uncorrelated vectors obtained by PCA are then used to estimate the independent component vectors ($Y$) and the corresponding de-mixing matrix with the fixed-point algorithm.
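As a rough illustration of this workflow, the sketch below uses scikit-learn's FastICA (which implements the fixed-point algorithm and performs PCA-based whitening internally) on synthetic mixtures; the sources, mixing matrix, and variable names are illustrative and are not the paper's data.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Three synthetic latent sources mixed into three observed series (x = A s per sample)
rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
S = np.c_[np.sin(2 * t),                       # smooth sinusoid
          np.sign(np.sin(3 * t)),              # square wave
          rng.laplace(size=t.size)]            # heavy-tailed noise source
A = np.array([[1.0, 0.5, 1.5],
              [0.5, 2.0, 1.0],
              [1.5, 1.0, 2.0]])                # mixing matrix
X = S @ A.T                                    # observed mixtures, one sample per row

ica = FastICA(n_components=3, random_state=0)  # fixed-point FastICA with internal whitening
Y = ica.fit_transform(X)                       # estimated independent components (Y)
W = ica.components_                            # estimated de-mixing matrix (Y ~ W X)
print(Y.shape, W.shape)                        # (2000, 3) (3, 3)
```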


About this article


Cite this article

Das, S.P., Achary, N.S. & Padhy, S. Novel hybrid SVM-TLBO forecasting model incorporating dimensionality reduction techniques. Appl Intell 45, 1148–1165 (2016). https://doi.org/10.1007/s10489-016-0801-3

