
Novel hybrid SVM-TLBO forecasting model incorporating dimensionality reduction techniques


Abstract

In this paper, we present a highly accurate forecasting method that supports improved investment decisions. The proposed method extends the novel hybrid SVM-TLBO model, which combines a support vector machine (SVM) with a teaching-learning-based optimization (TLBO) method that determines the optimal SVM parameters, by coupling it with dimensionality reduction techniques (DR-SVM-TLBO). The dimensionality reduction techniques (feature extraction approaches) extract critical, non-collinear, relevant, and de-noised information from the input variables (features) and reduce the time complexity. We investigated three feature extraction techniques: principal component analysis (PCA), kernel principal component analysis (KPCA), and independent component analysis (ICA). The feasibility and effectiveness of the proposed ensemble model were examined in a case study: predicting the daily closing prices of the COMDEX commodity futures index traded on the Multi Commodity Exchange of India Limited. We assessed the performance of the new ensemble model with the three feature extraction techniques using different performance metrics and statistical measures, and compared our results with those of a standard SVM model and an SVM-TLBO hybrid model. The experimental results show that the new ensemble model is viable and effective and provides better predictions. The proposed model can provide technical support for better financial investment decisions and can be used as an alternative for forecasting tasks that require more accurate predictions.



References

  1. Cai LJ, Zhang JQ, Cai Z, Lim KG (2006) An empirical study of dimensionality reduction in support vector machine. Neural Network World 16(3):177–192

  2. Cao LJ (2003) Support vector machines experts for time series forecasting. Neurocomputing 51:321–339

  3. Cao LJ, Chua KS, Chong WK, Lee HP, Gu QM (2003) A comparison of PCA, KPCA and ICA for dimensional reduction in support vector machines. Neurocomputing 55(1):321–336

  4. Cao LJ, Tay FEH (2003) Support vector machine with adaptive parameters in financial time series forecasting. IEEE Trans Neural Netw 14(6):1506–1518

  5. Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol (TIST) 2(3):27

  6. Chang PC, Wu JL (2015) A critical feature extraction by kernel PCA in stock trading model. Soft Comput 19(5):1393–1408

  7. Chen WH, Shih JY, Wu S (2006) Comparison of support-vector machines and back propagation neural networks in forecasting the six major Asian stock markets. Int J Electron Finance 1(1):49–67

  8. Das SP, Padhy S (2015) A novel hybrid model using teaching–learning-based optimization and a support vector machine for commodity futures index forecasting. Int J Mach Learn Cybern:1–15. doi:10.1007/s13042-015-0359-0

  9. Diebold FX, Mariano RS (1995) Comparing predictive accuracy. J Bus Econ Stat 13(3):253–263

  10. Ekenel HK, Sankur B (2004) Feature selection in the independent component subspace for face recognition. Pattern Recogn Lett 25(12):1377–1388

  11. Haykin S (2010) Neural networks and learning machines, 3rd edn. PHI Learning Private Limited

  12. Hsu CM (2013) A hybrid procedure with feature selection for resolving stock/futures price forecasting problems. Neural Comput Applic 22(3–4):651–671. doi:10.1007/s00521-011-0721-4

  13. Huang CL, Tsai CY (2009) A hybrid SOFM-SVR with a filter-based feature selection for stock market forecasting. Expert Syst Appl 36(2):1529–1539. doi:10.1016/j.eswa.2007.11.062

  14. Hyvärinen A, Karhunen J, Oja E (2001) Independent component analysis. Wiley, New York

  15. Hyvärinen A, Oja E (1997) A fast fixed-point algorithm for independent component analysis. Neural Comput 9(7):1483–1492

  16. Hyvärinen A, Oja E (2000) Independent component analysis: algorithms and applications. Neural Netw 13(4):411–430

  17. Ince H, Trafalis TB (2007) Kernel principal component analysis and support vector machines for stock price prediction. IIE Trans 39(6):629–637

  18. Ince H, Trafalis TB (2008) Short term forecasting with support vector machines and application to stock price prediction. Int J Gen Syst 37(6):677–687. doi:10.1080/03081070601068595

  19. Jiang M, Jiang S, Zhu L, Wang Y, Huang W, Zhang H (2013) Study on parameter optimization for support vector regression in solving the inverse ECG problem. Comput Math Methods Med, Article ID 158056. doi:10.1155/2013/158056

  20. Jolliffe IT (2002) Principal component analysis, 2nd edn. Springer, New York

  21. Kim KJ (2003) Financial time series forecasting using support vector machines. Neurocomputing 55(1):307–319

  22. Kim KJ, Han I (2000) Genetic algorithms approach to feature discretization in artificial neural networks for the prediction of stock price index. Expert Syst Appl 19(2):125–132

  23. Kim KJ, Lee WB (2004) Stock market prediction using artificial neural networks with optimal feature transformation. Neural Comput Applic 13(3):255–260. doi:10.1007/s00521-004-0428-x

  24. Kuang F, Zhang S, Jin Z, Xu W (2015) A novel SVM by combining kernel principal component analysis and improved chaotic particle swarm optimization for intrusion detection. Soft Comput 19:1187–1199. doi:10.1007/s00500-014-1332-7

  25. Lai RK, Fan CY, Huang WH, Chang PC (2009) Evolving and clustering fuzzy decision tree for financial time series data forecasting. Expert Syst Appl 36(2):3761–3773. doi:10.1016/j.eswa.2008.02.025

  26. Leung MT, Daouk H, Chen AS (2000) Forecasting stock indices: a comparison of classification and level estimation models. Int J Forecast 16(2):173–190

  27. Liang X, Zhang H, Xiao J, Chen Y (2009) Improving option price forecasts with neural networks and support vector regressions. Neurocomputing 72(13):3055–3065. doi:10.1016/j.neucom.2009.03.015

  28. Lin HT, Lin CJ (2003) A study on sigmoid kernels for SVM and the training of non-PSD kernels by SMO-type methods. Technical report, Department of Computer Science and Information Engineering, National Taiwan University, pp 1–32

  29. Lin SW, Ying KC, Chen SC, Lee ZJ (2008) Particle swarm optimization for parameter determination and feature selection of support vector machines. Expert Syst Appl 35(4):1817–1824

  30. Liu S, Tian L, Huang Y (2014) A comparative study on prediction of throughput in coal ports among three models. Int J Mach Learn Cybern 5(1):125–133. doi:10.1007/s13042-013-0201-5

  31. Lu CJ (2013) Hybridizing nonlinear independent component analysis and support vector regression with particle swarm optimization for stock index forecasting. Neural Comput Applic 23(7–8):2417–2427. doi:10.1007/s00521-012-1198-5

  32. Lu CJ, Lee TS, Chiu CC (2009) Financial time series forecasting using independent component analysis and support vector regression. Decis Support Syst 47(2):115–125

  33. Musa AB (2014) A comparison of ℓ1-regularization, PCA, KPCA and ICA for dimensionality reduction in logistic regression. Int J Mach Learn Cybern 5(6):861–873. doi:10.1007/s13042-013-0171-7

  34. Pawar PV, Rao RV (2013) Parameter optimization of machining using teaching-learning-based optimization algorithm. Int J Adv Manuf Technol 67:995–1006

  35. Porikli F, Haga T (2004) Event detection by eigenvector decomposition using object and frame features. In: IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'04), pp 114–114

  36. Rao RV, Patel V (2014) A multi-objective improved teaching-learning based optimization algorithm for unconstrained and constrained optimization problems. Int J Ind Eng Comput 5(1):1–22. doi:10.5267/j.ijiec.2013.09.007

  37. Rao RV, Savsani VJ, Vakharia DP (2011) Teaching-learning-based optimization: a novel method for constrained mechanical design optimization problems. Comput Aided Des 43(3):303–315

  38. Sapankevych NI, Sankar R (2009) Time series prediction using support vector machines: a survey. IEEE Comput Intell Mag 4(2):24–38. doi:10.1109/MCI.2009.932254

  39. Tay FEH, Cao LJ (2002) Modified support vector machines in financial time series forecasting. Neurocomputing 48(1):847–861

  40. Tsai CF, Hsiao YC (2010) Combining multiple feature selection methods for stock prediction: union, intersection, and multi-intersection approaches. Decis Support Syst 50(1):258–269

  41. Tsang PM, Kwok P, Choy SO, Kwan R, Ng SC, Mak J, Tsang J, Koong K, Wong TL (2007) Design and implementation of NN5 for Hong Kong stock price forecasting. Eng Appl Artif Intell 20(4):453–461. doi:10.1016/j.engappai.2006.10.002

  42. Twining CJ, Taylor CJ (2003) The use of kernel principal component analysis to model data distributions. Pattern Recogn 36(1):217–227

  43. Van Gestel T, Suykens JA, Baestaens DE, Lambrechts A, Lanckriet G, Vandaele B, Vandewalle J (2001) Financial time series prediction using least squares support vector machines within the evidence framework. IEEE Trans Neural Netw 12(4):809–821

  44. Vapnik V (1995) The nature of statistical learning theory. Springer, New York

  45. Wang J, Wang J (2015) Forecasting stock market indexes using principal component analysis and stochastic time effective neural networks. Neurocomputing 156:68–78

  46. Wang S, Meng B (2011) Parameter selection algorithm for support vector machine. Procedia Environ Sci 11:538–544. doi:10.1016/j.proenv.2011.12.085

  47. Wu CH, Tzeng GH, Lin RH (2009) A novel hybrid genetic algorithm for kernel function and parameter optimization in support vector regression. Expert Syst Appl 36(3):4725–4735. doi:10.1016/j.eswa.2008.06.046

  48. Zhai G, Chen J, Wang S, Li K, Zhang L (2015) Material identification of loose particles in sealed electronics devices using PCA and SVM. Neurocomputing 148:222–228. doi:10.1016/j.neucom.2013.10.043


Acknowledgments

We would like to express our gratitude to the National Institute of Science and Technology (NIST) for the facilities and resources provided at the Data Science Laboratory at NIST for the development of this study.

Author information


Corresponding author

Correspondence to Shom Prasad Das.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest (financial or non-financial) regarding the publication of this paper.

Appendices

Appendix A: Technical indicators (features) used in this study


Appendix B: Dimensionality reduction techniques used in this study

The objective of a dimensionality reduction technique is to reduce the dimension (number of features) of the input from a high-dimensional space to a low-dimensional subspace. Dimensionality reduction methods can be divided into two types: (i) feature selection and (ii) feature extraction. In feature selection, a subset of the original features is selected. In feature extraction, new features are computed by transforming the original features. We present brief reviews of the dimensionality reduction methods based on feature extraction that were used in our study: PCA, KPCA, and ICA.

B.1 Principal component analysis (PCA)

Principal component analysis is a well-known linear statistical approach to feature extraction. The objective is to reduce the dimension of the input features of the original dataset [20]. It uses an orthogonal transformation to convert a set of $N$ patterns (samples) of $l$ possibly correlated features into a set of $N$ samples of $m$ ($m \le l$) uncorrelated features called principal components (PCs). The transformation is designed such that the first principal component (PC1) has the highest possible variance, the second principal component (PC2) is orthogonal to PC1 and accounts for the next highest variance, and so on for the remaining PCs.

The PCA procedure is briefly described as follows.

Step 1:

Input $N$ patterns (samples) $X_1, X_2, \ldots, X_N$, each with $l$ features ($X_j \in R^l$). Each vector $X_j$, $j = 1, 2, \ldots, N$, is assumed to be mean-centred (that is, we subtract the mean value of each original feature from the corresponding feature values).

Step 2:

Compute the covariance matrix

$$ C=\frac{1}{N}\sum\limits_{k=1}^{N} {X_{k} {X_{k}^{T}}} $$
(B.1)

The $ij$-th element of the matrix $C$ is

$$ C_{ij} =\frac{1}{N}\sum\limits_{k=1}^{N} {X_{k} (i)X_{k} (j)} $$
(B.2)

where $X_k(i)$ denotes the $i$th component of the sample $X_k$.

Step 3:

Calculate the $l$ eigenvalues of $C$ and arrange them in non-increasing order $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_l$. For each eigenvalue $\lambda_i$, $i = 1, 2, \ldots, l$, compute an associated eigenvector $\alpha_i \in R^l$ of the matrix $C$ using an eigenvector decomposition technique [35].

Step 4:

Choose the $m \le l$ largest eigenvalues (choose the smallest integer $m$ such that $\lambda_{m-1} - \lambda_m$ is large, or such that $\sum\limits_{i=1}^{m} \lambda_i \ge t \sum\limits_{i=1}^{l} \lambda_i$, where $t = 0.95$ if we wish to retain 95% of the variance in the transformed data and $\sum\limits_{i=1}^{l} \lambda_i$ represents the total variance).

Step 5:

Use the eigenvectors (column vectors) $\alpha_1, \alpha_2, \ldots, \alpha_m$ to form the transformation matrix

$$ A=[\alpha_{1} \alpha_{2} ...\alpha_{m}] $$
(B.3)

Step 6:

Transform each pattern $X_i$ in the original space $R^l$ to the vector $Y_i$ in the $m$-dimensional space $R^m$ ($m < l$) using

$$ Y_{i} =A^{T}X_{i} ,\quad i=1,2,\ldots,N $$
(B.4)

So the $j$th component $Y_i(j)$ of $Y_i$ is the projection of $X_i$ on $\alpha_j$ (i.e., $Y_i(j) = \alpha_j^T X_i$).
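For illustration, a minimal NumPy sketch of Steps 1–6, assuming the samples are stacked as an $N \times l$ array and that the variance-retention rule of Step 4 is applied with $t = 0.95$; the function name pca_transform and the random demonstration data are illustrative, not taken from the paper.

```python
import numpy as np

def pca_transform(X, t=0.95):
    """Project N x l samples onto the m leading principal components.

    Follows Appendix B.1: centre the data, eigendecompose the covariance
    matrix, keep the smallest m whose eigenvalues retain a fraction t of
    the total variance, and project each sample (Y_i = A^T X_i).
    """
    Xc = X - X.mean(axis=0)                      # Step 1: zero-mean features
    C = (Xc.T @ Xc) / Xc.shape[0]                # Step 2: covariance matrix (B.1)
    eigvals, eigvecs = np.linalg.eigh(C)         # Step 3: eigen-decomposition
    order = np.argsort(eigvals)[::-1]            # non-increasing eigenvalues
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()   # Step 4: retained-variance rule
    m = int(np.searchsorted(ratio, t) + 1)
    A = eigvecs[:, :m]                           # Step 5: transformation matrix (B.3)
    return Xc @ A                                # Step 6: Y_i = A^T X_i (B.4)

# Illustrative usage with random data standing in for the technical indicators
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
Y = pca_transform(X, t=0.95)
print(Y.shape)   # (200, m) with m <= 10
```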

B.2 Kernel principal component analysis (KPCA)

In the PCA technique, each input pattern (sample) in $R^l$ is linearly projected onto a lower-dimensional subspace. This is appropriate when the data approximately lie on a linear manifold (for example, a hyperplane). However, in many applications the input data lie on a low-dimensional nonlinear manifold, and it is then more appropriate to use KPCA, which is a nonlinear dimensionality reduction technique. In this method, the input patterns $X_i \in R^l$ for $i = 1, 2, \ldots, N$ (where $N$ is the number of input samples) are first mapped onto a space $H$ with more than $l$ dimensions using a nonlinear mapping $\phi: R^l \to H$ [42]. Their images $\phi(X_i)$ are projected along the orthonormal eigenvectors of the covariance matrix of the $\phi(X_i)$'s. These projections involve only inner products of the $\phi(X_i)$'s in $H$; since $\phi$ is not explicitly known, these inner products are difficult to compute directly. So we use a kernel function $K: R^l \times R^l \to R$, defined such that

$$ K(X_{i} ,X_{j} )=<\phi (X_{i} ),\phi (X_{j})> $$
(B.5)

(where $\langle \cdot ,\cdot \rangle$ denotes the inner product in $H$), to compute the inner products involved in the projections, leading to the computation of the $Y_i$'s, which have fewer dimensions $m$ ($m < l$) than the $X_i$'s. It has been proved that the components $Y_i(k)$, $k = 1, 2, \ldots, m$, of the $Y_i$'s are uncorrelated and that the first $q$ ($q \le m$) principal components have maximum mutual information with respect to the inputs, which justifies the use of the method for dimensionality reduction.

The KPCA procedure is given in the form of the following algorithm.

Step 1:

Input the data patterns (samples) $X_i \in R^l$ for $i = 1, 2, \ldots, N$ (where $N$ is the number of input samples).

Step 2:

Choose a kernel function $K: R^l \times R^l \to R$ and compute the kernel matrix $K_1$ whose $ij$-th element is equal to $K(X_i, X_j)$ for $i, j = 1, 2, \ldots, N$.

Step 3:

Compute the eigenvalues and eigenvectors of $K_1$. Arrange the eigenvalues in non-increasing order $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_N$, and let the corresponding eigenvectors be $a_1, a_2, \ldots, a_N$.

Step 4:

Choose the $m$ dominant eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_m$ ($m \le l$) [choose the smallest integer $m$ such that $\lambda_{m-1} - \lambda_m$ is large, or such that $\sum\limits_{i=1}^{m} \lambda_i \ge t \sum\limits_{i=1}^{N} \lambda_i$, where $t = 0.95$ if we wish to retain 95% of the variance in the transformed data and $\sum\limits_{i=1}^{N} \lambda_i$ represents the total variance], and normalize the corresponding eigenvectors $a_1, a_2, \ldots, a_m$ using

$$ a_{k}^{\prime} =\frac{a_{k} }{\left\| {a_{k} } \right\|\sqrt {\lambda_{k} } },\quad k=1,2,\ldots,m $$
(B.6)

Step 5:

For each $X_i$, $i = 1, 2, \ldots, N$, compute the $m$ projections $Y_i(k)$ of $\phi(X_i)$ onto each of the orthonormal eigenvectors $a_{k}^{\prime}$, $k = 1, 2, \ldots, m$, i.e.,

$$ Y_{i} (k)=\sum\limits_{j=1}^{N} {a_{k}^{\prime} (j)K(X_{i} ,X_{j} )} ,\quad k=1,2,\ldots,m $$
(B.7)
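A compact NumPy sketch of Steps 1–5 above, assuming a Gaussian (RBF) kernel with an illustrative gamma value; note that the procedure as listed (and therefore this sketch) does not centre the kernel matrix, a step that standard KPCA implementations such as scikit-learn's KernelPCA add. The names rbf_kernel and kpca_transform are illustrative.

```python
import numpy as np

def rbf_kernel(X, gamma=0.1):
    # Kernel matrix K1 with K1[i, j] = exp(-gamma * ||X_i - X_j||^2)   (Step 2)
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * d2)

def kpca_transform(X, m, gamma=0.1):
    """Project N x l samples onto the m leading kernel principal components."""
    K1 = rbf_kernel(X, gamma)                      # Step 2: N x N kernel matrix
    eigvals, eigvecs = np.linalg.eigh(K1)          # Step 3: eigen-decomposition
    order = np.argsort(eigvals)[::-1]              # non-increasing eigenvalues
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step 4: eigh returns unit-norm eigenvectors, so (B.6) reduces to a_k / sqrt(lambda_k)
    a_prime = eigvecs[:, :m] / np.sqrt(eigvals[:m])
    # Step 5: Y_i(k) = sum_j a'_k(j) K(X_i, X_j)    (B.7)
    return K1 @ a_prime

# Illustrative usage
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 10))
Y = kpca_transform(X, m=5)
print(Y.shape)   # (150, 5)
```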

B.3 Independent component analysis (ICA)

Independent component analysis (ICA) is a relatively recent statistical method [14, 16]. Unlike PCA, which only produces uncorrelated components, ICA attempts to find components or factors in the transformed vectors that are statistically independent. The primary goal of the method is to find a representation of non-Gaussian data in which the components are statistically independent, or as independent as possible [16].

In ICA, we assume that the $l$ measured variables $X = [x_1, x_2, \ldots, x_l]^T$ can be expressed as linear combinations of $n$ unknown latent source components $S = [s_1, s_2, \ldots, s_n]^T$, i.e.,

$$ X=AS $$
(B.8)

where $A_{l \times l}$ is an unknown mixing matrix. Here, we consider that $l = n$, so that $A$ is a full-rank square matrix. $S$ is the latent source data that cannot be directly observed from the input mixture data $X$. The basic ICA objective is to estimate the latent source components $S$ and the unknown mixing matrix $A$ from $X$, with appropriate assumptions on the statistical properties of the source distribution. The basic ICA model for feature transformation aims to find a de-mixing matrix $W_{l \times l}$ such that

$$ Y=WX $$
(B.9)

where $Y = [y_1, y_2, \ldots, y_n]^T$ is the independent component vector. The elements of $Y$ must be statistically independent and are called independent components (ICs). Here, $W = A^{-1}$ (i.e., the de-mixing matrix $W$ is the inverse of the mixing matrix $A$). The ICs ($y_i$) can be used to estimate the latent source signals $s_i$.

Many algorithms can perform ICA. The fixed-point FastICA method presented by Hyvärinen and Oja [15] is the most popular, and we used it in our experimental study. In this algorithm, PCA is first used to transform the original input vectors ($X$) into a set of new uncorrelated vectors with zero mean and unit variance (whitening). This step reduces the dimension of $X$ and consequently the number of independent components in $Y$. The uncorrelated vectors obtained by PCA are then used to estimate the independent component vectors ($Y$) and the corresponding de-mixing matrix with the fixed-point algorithm.
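As a rough illustration of this workflow, the sketch below uses scikit-learn's FastICA (which implements the fixed-point algorithm and performs PCA-based whitening internally) on synthetic mixtures; the sources, mixing matrix, and variable names are illustrative and are not the paper's data.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Three synthetic latent sources mixed into three observed series (x = A s per sample)
rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
S = np.c_[np.sin(2 * t),                       # smooth sinusoid
          np.sign(np.sin(3 * t)),              # square wave
          rng.laplace(size=t.size)]            # heavy-tailed noise source
A = np.array([[1.0, 0.5, 1.5],
              [0.5, 2.0, 1.0],
              [1.5, 1.0, 2.0]])                # mixing matrix
X = S @ A.T                                    # observed mixtures, one sample per row

ica = FastICA(n_components=3, random_state=0)  # fixed-point FastICA with internal whitening
Y = ica.fit_transform(X)                       # estimated independent components (Y)
W = ica.components_                            # estimated de-mixing matrix (Y ~ W X)
print(Y.shape, W.shape)                        # (2000, 3) (3, 3)
```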


About this article


Cite this article

Das, S.P., Achary, N.S. & Padhy, S. Novel hybrid SVM-TLBO forecasting model incorporating dimensionality reduction techniques. Appl Intell 45, 1148–1165 (2016). https://doi.org/10.1007/s10489-016-0801-3

