Abstract
The distribution of financial time series, particularly stock market prices, can change at a very high frequency, which makes predicting future behavior very challenging. Traditional regression algorithms applied in this scenario assume that all data samples are equally important for model building. Our work examines an alternative data pre-processing approach in which knowledge of distribution changes is used to pre-filter the training dataset. Experimental results indicate that this simple and efficient technique improves prediction accuracy when used in conjunction with a range of forecasting techniques.
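The pre-filtering idea can be illustrated with a minimal sketch. The abstract does not say how distribution changes are detected, so the sketch below assumes a two-sample Kolmogorov-Smirnov statistic computed over adjacent sliding windows. The function names, the `window` size, and the `threshold` are hypothetical choices for illustration, not the authors' method.

```python
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the
    empirical CDFs of a and b (assumed detector, not from the paper)."""
    sa, sb = sorted(a), sorted(b)
    return max(
        abs(bisect_right(sa, x) / len(sa) - bisect_right(sb, x) / len(sb))
        for x in sa + sb
    )


def last_change_point(series, window=30, threshold=0.5):
    """Scan backwards and return the first index at which the `window`
    values before it and the `window` values after it differ by more than
    `threshold`. Returns 0 if no such index is found.

    Because the scan stops at the first hit from the end, the returned
    index tends to fall slightly after the true change, so a few recent
    samples may be discarded along with the stale ones.
    """
    for i in range(len(series) - window, window - 1, -1):
        before = series[i - window:i]
        after = series[i:i + window]
        if ks_statistic(before, after) > threshold:
            return i
    return 0


def filter_training_data(series, window=30, threshold=0.5):
    """Keep only the samples after the most recent detected change."""
    return series[last_change_point(series, window, threshold):]


if __name__ == "__main__":
    random.seed(0)
    # Synthetic series (illustrative only): the mean shifts at index 200.
    prices = [random.gauss(0, 1) for _ in range(200)]
    prices += [random.gauss(10, 1) for _ in range(100)]
    kept = filter_training_data(prices)
    print(f"kept {len(kept)} of {len(prices)} samples for training")
```

Any forecasting model would then be fitted on the filtered series rather than on the full history, which matches the abstract's point that not all samples are equally useful for model building.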
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ristanoski, G., Bailey, J. (2011). Distribution Based Data Filtering for Financial Time Series Forecasting. In: Wang, D., Reynolds, M. (eds) AI 2011: Advances in Artificial Intelligence. AI 2011. Lecture Notes in Computer Science, vol 7106. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25832-9_13
DOI: https://doi.org/10.1007/978-3-642-25832-9_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25831-2
Online ISBN: 978-3-642-25832-9
eBook Packages: Computer Science, Computer Science (R0)