News Sensitive Stock Trend Prediction

Fung, Gabriel Pui Cheong; Yu, Jeffrey Xu; Lam, Wai

doi:10.1007/3-540-47887-6_48

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2336)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Stock market prediction with data mining techniques is one of the most important issues to be investigated. In this paper, we present a system that predicts changes in stock trends by analyzing the influence of non-quantifiable information (news articles). In particular, we investigate the immediate impact of news articles on the time series, based on the Efficient Markets Hypothesis. Several data mining and text mining techniques are used in a novel way. A new statistics-based piecewise segmentation algorithm is proposed to identify trends in the time series. The segmented trends are clustered into two categories, Rise and Drop, according to the slope of each trend and its coefficient of determination. We propose an algorithm, called guided clustering, to filter news articles with the help of the clusters obtained from the trends. We also propose a new differentiated weighting scheme that assigns higher weights to features that occur in the Rise (Drop) news-article cluster but do not occur in the opposite Drop (Rise) cluster.

For example, the same article may align to more than one type of trend.
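
To make the two quantities named in the abstract concrete, the sketch below fits a least-squares line to one already-segmented run of prices and reports its slope and coefficient of determination. It is only an illustration, not the method of the paper: the paper clusters segments into Rise and Drop, whereas this sketch substitutes a simple threshold, and the names `fit_trend`, `label_segment` and the cut-off `min_r2` are assumptions introduced here.

```python
from statistics import mean


def fit_trend(prices):
    """Fit a least-squares line to prices at x = 0, 1, 2, ...

    Returns (slope, r_squared), where r_squared is the coefficient
    of determination of the fitted line.
    """
    n = len(prices)
    if n < 2:
        raise ValueError("a trend needs at least two points")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(prices)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, prices))
    syy = sum((y - y_bar) ** 2 for y in prices)
    slope = sxy / sxx
    # Convention chosen for this sketch: a perfectly flat segment has no
    # variance to explain, so its R^2 is reported as 0.
    r_squared = 0.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, r_squared


def label_segment(prices, min_r2=0.5):
    """Label a segment "Rise" or "Drop", or None if the fit is too weak.

    min_r2 is a hypothetical threshold standing in for the clustering
    step described in the abstract.
    """
    slope, r_squared = fit_trend(prices)
    if r_squared < min_r2 or slope == 0:
        return None
    return "Rise" if slope > 0 else "Drop"


if __name__ == "__main__":
    for segment in ([10, 11, 12, 13], [13, 12, 11, 10], [10, 12, 9, 11, 10]):
        print(segment, fit_trend(segment), label_segment(segment))
```

Using both quantities together means a segment is labelled only when the prices move in a clear direction (the sign of the slope) and a straight line actually describes them well (a high coefficient of determination); a noisy, directionless segment is left unlabelled.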

Author information

Authors and Affiliations

Department of Systems Engineering & Engineering Management, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
Gabriel Pui Cheong Fung, Jeffrey Xu Yu & Wai Lam

Editor information

Editors and Affiliations

EE Department, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei, Taiwan, ROC
Ming-Syan Chen
IBM Thomas J. Watson Research Center, 30 Sawmill River Road, Hawthorne, NY, 10532, USA
Philip S. Yu
School of Computing, National University of Singapore, Lower Kent Ridge Road, Singapore, 119260
Bing Liu

About this paper

Cite this paper

Fung, G.P.C., Yu, J.X., Lam, W. (2002). News Sensitive Stock Trend Prediction. In: Chen, MS., Yu, P.S., Liu, B. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2002. Lecture Notes in Computer Science, vol 2336. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47887-6_48

DOI: https://doi.org/10.1007/3-540-47887-6_48
Published: 29 April 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43704-8
Online ISBN: 978-3-540-47887-4
eBook Packages: Springer Book Archive
