DOI: 10.1145/2808797.2808845

Classifying Stocks using P-Trees and Investor Sentiment

Published: 25 August 2015

ABSTRACT

People who are not exposed to the financial markets but would like to invest in stocks need to know which ticker symbols to follow before they can make investment decisions. In this work, we propose a way to classify stock ticker symbols from tweets vertically using P-Trees. The solution described in this paper analyzes 3000 financial and news symbols vertically from the Twitter platform and finds the ticker symbols that are most frequently discussed. It also makes it possible to scan the tweet texts associated with common ticker occurrences so that the context can be identified, helping users make better-informed business decisions. The paper also discusses investor bias and the effect it has on the volatility of stocks in the market.
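The vertical idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses flat, uncompressed bit vectors (one Python integer per ticker) in place of compressed P-tree structures, and the sample tweets, the cashtag pattern, and all function names are assumptions made for illustration only. Each ticker gets one bit per tweet, so frequency is a bit count and co-occurrence is a bitwise AND.

```python
import re
from collections import defaultdict

# Hypothetical sample tweets; not data from the paper.
TWEETS = [
    "Watching $AAPL and $MSFT ahead of earnings",
    "$AAPL breaking out, volume is heavy",
    "Rotating out of $TSLA into $AAPL",
    "$MSFT cloud numbers look strong",
]

# Assumed cashtag convention: a dollar sign followed by 1 to 5 capital letters.
CASHTAG = re.compile(r"\$([A-Z]{1,5})\b")


def build_bit_vectors(tweets):
    """Map each ticker to an int whose bit i is set when tweet i mentions it."""
    vectors = defaultdict(int)
    for i, text in enumerate(tweets):
        for ticker in set(CASHTAG.findall(text)):
            vectors[ticker] |= 1 << i
    return dict(vectors)


def count(bits):
    """Number of tweets flagged in a bit vector."""
    return bin(bits).count("1")


def rank_tickers(vectors):
    """Tickers ordered by mention count, most frequent first (ties by name)."""
    return sorted(vectors, key=lambda t: (-count(vectors[t]), t))


def tweets_for(bits, tweets):
    """Recover the tweet texts whose bit is set, to inspect their context."""
    return [text for i, text in enumerate(tweets) if (bits >> i) & 1]


vectors = build_bit_vectors(TWEETS)
ranking = rank_tickers(vectors)

# Co-occurrence of two tickers is a single AND over their vertical vectors.
both = vectors["AAPL"] & vectors["MSFT"]
shared_context = tweets_for(both, TWEETS)
```

Here `ranking` lists the tickers by how often they are mentioned, and `shared_context` holds the tweets that mention both `AAPL` and `MSFT`, which mirrors the abstract's step of scanning the tweet texts behind common ticker occurrences. A real P-tree would additionally compress runs of identical bits, which this sketch omits.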

      • Published in

        ASONAM '15: Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015
        August 2015
        835 pages
        ISBN: 9781450338547
        DOI: 10.1145/2808797

        Copyright © 2015 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • research-article
        • Research
        • Refereed limited

        Acceptance Rates

        Overall Acceptance Rate: 116 of 549 submissions, 21%
