Abstract
This paper presents a kernel-based principal component analysis (kernel PCA) method for extracting critical features to improve the performance of a stock trading model. Feature extraction is one family of techniques for solving dimensionality reduction problems (DRP). Kernel PCA is a feature extraction approach that transforms the known variables so as to capture their critical information; as a kernel-based data mapping tool, it combines the characteristics of principal component analysis and non-linear mapping. Feature selection, the other family of DRP techniques, keeps only a small subset of the known variables, but the selected features may still be collinear and therefore fail to convey clear information. Most feature extraction methods, by contrast, use a variable mapping to eliminate noisy and collinear variables. In this research, we use kernel PCA in a stock trading model to transform stock technical indices (TI) into a smaller set of features. The method is applied to various stocks under sliding window testing with both half-year and 1-year testing strategies. The experimental results show that the proposed method generates more profit than other DRP methods on the American stock market. The stock trading model is practical for real-world application and can be implemented in a real-time environment.
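As a rough illustration of the transformation described in the abstract, the following is a minimal sketch of kernel PCA written with NumPy. It is not the authors' implementation: the Gaussian (RBF) kernel, the value of `gamma`, the number of retained components, and the synthetic `indicators` matrix standing in for technical indices are all assumptions made for the example, and the function names are hypothetical.

```python
import numpy as np

def rbf_kernel(X, gamma):
    # Pairwise squared Euclidean distances, then the Gaussian (RBF) kernel.
    # The RBF kernel is an assumption; the abstract does not name a kernel.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_pca(X, n_components, gamma):
    n = X.shape[0]
    K = rbf_kernel(X, gamma)
    # Center the kernel matrix, i.e. center the data in feature space.
    one_n = np.full((n, n), 1.0 / n)
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    idx = np.argsort(eigvals)[::-1][:n_components]
    lam, V = eigvals[idx], eigvecs[:, idx]
    # The projection of training point i onto component j is
    # sqrt(lam_j) * V[i, j].
    features = V * np.sqrt(np.maximum(lam, 0.0))
    return features, lam

# Synthetic stand-in for 60 trading days x 8 technical indices (hypothetical).
rng = np.random.default_rng(0)
indicators = rng.normal(size=(60, 8))

# Standardize each index so that no single scale dominates the kernel.
Z = (indicators - indicators.mean(axis=0)) / indicators.std(axis=0)

features, eigenvalues = kernel_pca(Z, n_components=3, gamma=1.0 / Z.shape[1])
print(features.shape)  # (60, 3)
```

The extracted columns are mutually orthogonal and have zero mean on the training data, which is what removes the collinearity among the original indices. In a sliding window test, the same steps would be refitted on each training window before the trading model is evaluated on the following test period.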
Additional information
Communicated by V. Loia.
About this article
Cite this article
Chang, PC., Wu, JL. A critical feature extraction by kernel PCA in stock trading model. Soft Comput 19, 1393–1408 (2015). https://doi.org/10.1007/s00500-014-1350-5
DOI: https://doi.org/10.1007/s00500-014-1350-5