ABSTRACT
People with little exposure to the financial markets who want to invest in stocks need to know which ticker symbols to follow before they can make investment decisions. In this work, we propose a method for classifying stock ticker symbols from tweets vertically using P-Trees. The solution described in this paper analyzes 3000 financial and news symbols from the Twitter platform vertically and finds the ticker symbols that are most frequently discussed. It also lets users scan the tweet texts associated with the common ticker occurrences, so that the context can be identified and users can make better-informed business decisions. The paper also discusses investor bias and the effect it has on the volatility of stocks in the market.
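The idea of scanning tweets vertically can be illustrated with a minimal sketch. It is not the authors' P-Tree implementation: it assumes tickers appear as cashtags such as `$AAPL`, and it stores one uncompressed bit column per ticker (a Python integer) where a P-Tree would compress that column into a tree. The function names and sample tweets are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: tickers are written as cashtags, e.g. "$AAPL" (1 to 5 capital letters).
TICKER_RE = re.compile(r"\$([A-Z]{1,5})\b")


def build_vertical_index(tweets):
    """Return {ticker: bit column}; bit i is set when tweet i mentions the ticker."""
    index = defaultdict(int)
    for i, text in enumerate(tweets):
        for symbol in set(TICKER_RE.findall(text)):
            index[symbol] |= 1 << i
    return dict(index)


def most_discussed(index, top_n=3):
    """Rank tickers by the number of set bits in their column."""
    counts = {symbol: bin(bits).count("1") for symbol, bits in index.items()}
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


def tweets_mentioning(index, tweets, *symbols):
    """Return the tweets that mention every given ticker (bitwise AND of columns)."""
    if not symbols:
        return []
    bits = index.get(symbols[0], 0)
    for symbol in symbols[1:]:
        bits &= index.get(symbol, 0)
    return [text for i, text in enumerate(tweets) if (bits >> i) & 1]


# Hypothetical sample data, for illustration only.
sample_tweets = [
    "$AAPL earnings beat expectations",
    "Watching $AAPL and $MSFT today",
    "$TSLA deliveries are up",
    "$MSFT cloud growth; $AAPL flat",
]
sample_index = build_vertical_index(sample_tweets)
print(most_discussed(sample_index, top_n=2))
print(tweets_mentioning(sample_index, sample_tweets, "AAPL", "MSFT"))
```

Counting set bits gives the mention frequency of a ticker, and ANDing two columns recovers the tweets in which both tickers occur, which is the kind of context lookup the abstract describes.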
Classifying Stocks using P-Trees and Investor Sentiment