A Proposed Framework for Evaluating the Effectiveness of Financial News Sentiment Scoring Datasets

  • Conference paper
Enterprise Applications and Services in the Finance Industry (FinanceCom 2014)

Abstract

The impact of financial news on financial markets has been studied extensively, and a number of news sentiment scoring techniques are widely used in research and industry. However, results from sentiment studies are hard to interpret when contextual and sentiment-related parameters change. Sometimes the conditions that led to the results are not fully documented, so the results are not repeatable. Based on service-oriented computing principles, this paper proposes a framework that automates the process of incorporating different contextual parameters when running news sentiment impact studies. The framework also preserves the parameters, datasets, and conditions of each study so that end users can reproduce their results. This is demonstrated using a case study that shows how end users can flexibly select different contextual and sentiment-related parameters and conduct news impact studies on daily stock prices.
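
The idea of preserving a study's parameters alongside its result can be illustrated with a minimal Python sketch. This is not the authors' framework: `StudyConfig`, `run_study`, the ticker `XYZ`, the threshold rule, and the toy data are all hypothetical names and values chosen for illustration, and the impact measure (mean same-day return on days with high sentiment) is a deliberately simple stand-in for a real news impact study.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class StudyConfig:
    """Hypothetical parameter set for one news impact study run."""
    ticker: str                 # contextual parameter: which stock
    start_date: str             # contextual parameter: ISO date, inclusive
    end_date: str               # contextual parameter: ISO date, inclusive
    sentiment_threshold: float  # sentiment parameter: minimum score counted as positive

    def fingerprint(self) -> str:
        """Stable hash of the parameters, usable as a key for stored results."""
        canonical = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def mean_return_on_positive_news_days(config, sentiment_by_day, return_by_day):
    """Average same-day return over in-range days whose score meets the threshold."""
    selected = [
        return_by_day[day]
        for day, score in sentiment_by_day.items()
        # ISO-formatted dates compare correctly as plain strings.
        if config.start_date <= day <= config.end_date
        and score >= config.sentiment_threshold
        and day in return_by_day
    ]
    return sum(selected) / len(selected) if selected else None


def run_study(config, sentiment_by_day, return_by_day):
    """Return the result together with the exact parameters that produced it."""
    return {
        "config": asdict(config),
        "config_fingerprint": config.fingerprint(),
        "mean_return": mean_return_on_positive_news_days(
            config, sentiment_by_day, return_by_day
        ),
    }


# Toy data (invented for illustration only).
config = StudyConfig("XYZ", "2014-01-01", "2014-01-31", 0.5)
sentiment = {"2014-01-02": 0.8, "2014-01-03": 0.1, "2014-01-06": 0.6, "2014-02-03": 0.9}
returns = {"2014-01-02": 0.02, "2014-01-03": -0.01, "2014-01-06": 0.04, "2014-02-03": 0.10}
result = run_study(config, sentiment, returns)
```

Because the fingerprint depends only on the parameter values, rerunning with the same `StudyConfig` yields the same key, while changing any parameter (for example the threshold) yields a different one, so a stored result can always be traced back to the conditions that produced it.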

References

  1. Healy, A.D., Lo, A.W.: Managing real-time risks and returns: the thomson reuters newsscope event indices. In: Professor Hand, D.J., Professor of Statistics, Imperial College, London; Chief Scientific Advisor, Winton Capital Management; and President, Royal Statistical Society, 73

  2. Moniz, A., Brar, G., Davis, C.: Have I got news for you. Macquarie Research Report (2009)

  3. Al Shaikh, M.M., Prendinger, H., Ishizuka, M.: An analytical approach to assess sentiment of text. In: 10th International Conference on Computer and Information Technology, 2007 ICCIT 2007, pp. 1–6 (2007)

  4. Antweiler, W., Frank, M.Z.: Is all that talk just noise? the information content of internet stock message boards. J. Finan. 59(3), 1259–1294 (2004)

  5. Azar, P.D.: Sentiment analysis in financial news (Doctoral dissertation, Harvard University) (2009)

  6. Baker, M., Wurgler, J.: Investor sentiment and the cross section of stock returns. J. Finan. 61(4), 1645–1680 (2006)

  7. Barber, B.M., Odean, T.: All that glitters: The effect of attention and news on the buying behavior of individual and institutional investors. Rev. Finan. Stud. 21(2), 785–818 (2008)

  8. Beheshti, S., Venugopal, S., Ryu, S.H., Benatallah, B., Wang, W.: Big data and cross-document coreference resolution: Current state and future opportunities (2013). ArXiv Preprint arXiv:1311.3987

  9. Bollen, J., Mao, H.: Twitter mood as a stock market predictor. Computer 44(10), 91–94 (2011)

  10. Baker, B.H.: Types of media bias. Retrieved August, 2014. http://www.studentnewsdaily.com/types-of-media-bias/. (2013)

  11. Cahan, R., Jussa, J., Luo, Y.: Breaking news: How to use news sentiment to pick stocks. Macquarie US Equity Research (2009)

  12. Cambria, E., Schuller, B., Xia, Y., Havasi, C.: New avenues in opinion mining and sentiment analysis. ieeexplore.ieee.org (2013)

  13. Cambria, E., Song, Y., Wang, H., Howard, N.: Semantic multi-dimensional scaling for open-domain sentiment analysis. ieeexplore.ieee.org (2013)

  14. Cambria, E., Xia, Y., Hussain, A.: Affective common sense knowledge acquisition for sentiment analysis. lrec.elra.info (2012)

  15. Carmelo Montalbano. (2014). How to measure stock returns. Retrieved Jan, 2014. http://www.ehow.com/how_7811128_measure-stock-returns.html

  16. Da, Z., Engelberg, J., Gao, P.: In search of attention. J. Finan. 66(5), 1461–1499 (2011)

  17. Das, S.R., Chen, M.Y.: Yahoo! for amazon: Sentiment extraction from small talk on the web. Manage. Sci. 53(9), 1375–1388 (2007)

  18. Dzielinski, M., Rieger, M.O., Talpsepp, T.: Volatility asymmetry, news, and private investors. The Handbook of News Analytics in Finance, pp. 255–270 (2011)

  19. Fang, L., Peress, J.: Media coverage and the Cross section of stock returns. J. Finan. 64(5), 2023–2052 (2009)

  20. Hafez, P.: Detection of seasonality in newsflow. White Paper Available from RavenPack (2009)

  21. Hagenau, M., Korczak, A., Neumann, D.: Buy on bad news, sell on good news: How insider trading analysis can benefit from textual analysis of corporate disclosures. In: Workshop on Information Systems and Economics (WISE 2012), Orlando, Florida, USA (2012)

  22. Hirshleifer, D., Lim, S.S., Teoh, S.H.: Driven to distraction: Extraneous events and underreaction to earnings news. J. Finan. 64(5), 2289–2325 (2009)

  23. Investopedia (2014). Expected return. Retrieved Jan, 2014. http://www.investopedia.com/terms/e/expectedreturn.asp

  24. Investopedia. (2014). Retrieved Jan, 2014. http://www.investopedia.com

  25. Jasny, B.R., Chin, G., Chong, L., Vignieri, S.: Data replication & reproducibility. again, and again, and again…. introduction. Science 334(6060), 1225 (2011). (New York, N.Y.)

  26. Jindal, N., Liu, B.: Identifying comparative sentences in text documents. In: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 244–251 (2006)

  27. Joachims, T.: Making large scale SVM learning practical. Universität Dortmund (1999)

  28. McCoy, C.J.: Understanding seasonality in search. Retrieved July, 2014. http://searchenginewatch.com/article/2325080/Understanding-Seasonality-in-Search. (2014)

  29. Kothari, S., Li, X., Short, J.E.: The effect of disclosures by management, analysts, and business press on cost of capital, return volatility, and analyst forecasts: A study using content analysis. Account. Rev. 84(5), 1639–1670 (2009)

  30. Leinweber, D.: Nerds on wall street. Math, Machines and Wired Markets (2009)

  31. Zhang, L.: Sentiment analysis on twitter with stock price and significant keyword correlation. Retrieved Jan, 2014. http://apps.cs.utexas.edu/tech_reports/reports/tr/TR-2124.pdf. (2013)

  32. Liu, B.: Sentiment analysis and opinion mining. Synth. Lect. Hum. Lang. Technol. 5(1), 1–167 (2012)

  33. Loughran, T., McDonald, B.: When is a liability not a liability? textual analysis, dictionaries, and 10 Ks. J. Finan. 66(1), 35–65 (2011)

  34. Lugmayr, A.: Predicting the future of investor sentiment with social media in stock exchange investments: A basic framework for the DAX performance index. In: Handbook of social media management, pp. 565–589. Springer, Heidelberg (2013)

  35. Mitra, G., Mitra, L.: The handbook of news analytics in finance. John Wiley & Sons (2011)

  36. Narayanan, R., Liu, B., Choudhary, A.: Sentiment analysis of conditional sentences. In: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1-vol. 1, pp. 180–189 (2009)

  37. Nicholls, C., Song, F.: Comparison of feature selection methods for sentiment analysis. In: Advances in Artificial Intelligence, pp. 286–289. Springer, Berlin Heidelberg (2010)

  38. O’Keefe, T., Koprinska, I.: Feature selection and weighting methods in sentiment analysis. cs.otago.ac.nz (2009)

  39. Pang, B., Lee, L., Vaithyanathan, S.: Thumbs up?: Sentiment classification using machine learning techniques. In: Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing-vol. 10, pp. 79–86 (2002)

  40. Pang, B., Lee, L.: A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. In: Proceedings of the 42nd Annual Meeting on (2004)

  41. Peng, R.D.: Reproducible research in computational science. Science 334(6060), 1226–1227 (2011). (New York, N.Y.)

  42. Pink, G., Radford, W., Cannings, W., Naoum, A., Nothman, J., Tse, D., et al.: SYDNEY CMCRC at TAC 2013. In: Proceedings of the Text Analysis Conference (TAC2013) (2013)

  43. Princeton University. WordNet: A lexical database for english. Retrieved June, 2014. http://wordnet.princeton.edu/. (2014)

  44. Rabhi, F.A., Yao, L., Guabtni, A.: ADAGE: A framework for supporting user-driven ad-hoc data analysis processes. Computing 94(6), 489–519 (2012)

  45. Rasolofo, Y., Savoy, J.: Term proximity scoring for keyword-based retrieval systems. In: Sebastiani, F. (ed.) ECIR 2003. LNCS, vol. 2633, pp. 207–218. Springer, Heidelberg (2003)

  46. RavenPack. RavenPack news scores user guide. RavenPack (2010)

  47. Robertson, C., Geva, S., Wolff, R.: What types of events provide the strongest evidence that the stock market is affected by company specific news? Proc. Fifth Australas. Conf. Data Min. Analytics 61, 145–153 (2006)

  48. Robertson, C.S., Rabhi, F.A., Peat, M.: A service-oriented approach towards real time financial news analysis. In: Consumer Information Systems (2011)

  49. Schneider, K.: On word frequency information and negative evidence in naive bayes text classification. In: Advances in Natural Language Processing, pp. 474–485. Springer (2004)

  50. Scott, J., Stumpp, M., Xu, P.: News, not trading volume, builds momentum. Finan. Anal. J. 46, 45–54 (2003)

  51. SenticNet (2014). Semantic based sentiment analysis. Retrieved April, 2014. http://sentic.net/api/en/concept/celebrate_special_occasion/

  52. Siering, M.: “Boom” or “ruin”–does it make a difference? using text mining and sentiment analysis to support intraday investment decisions. In: 2012 45th Hawaii International Conference on System Science (HICSS), pp. 1050–1059 (2012)

  53. Sirca (2014). Retrieved June, 2014. http://www.sirca.org.au/

  54. Stanford named entity recognizer (NER). (27/08/2014). Retrieved May, 2014. http://nlp.stanford.edu/software/CRF-NER.shtml

  55. Taboada, M., Brooke, J., Tofiloski, M., Voll, K., Stede, M.: Lexicon-based methods for sentiment analysis. Computat. Linguist. 37(2), 267–307 (2011)

  56. Tetlock, P.C.: Giving content to investor sentiment: The role of media in the stock market. J. Finan. 62(3), 1139–1168 (2007)

  57. Tetlock, P.C., Saar Tsechansky, M., Macskassy, S.: More than words: Quantifying language to measure firms’ fundamentals. J. Finan. 63(3), 1437–1467 (2008)

  58. Reuters, T.: (2013). OpenCalais product. Retrieved July, 2014. http://www.opencalais.com/

  59. Reuters, T.: Thomson reuters news analytics. Retrieved Jan, 2014. http://thomsonreuters.com/products/financial-risk/01_255/news-analytics-product-brochure–oct-2010.pdf. (2010)

  60. Turney, P.D.: Thumbs up or thumbs down?: Semantic orientation applied to unsupervised classification of reviews. In: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, pp. 417–424 (2002)

  61. University of Sheffield. (2014). GATE projects. Retrieved Mar, 2014. https://gate.ac.uk/projects.html

  62. What is search volume index? (2013). Retrieved August, 2014. http://www.quora.com/What-is-Search-Volume-Index

  63. Wiebe, J.M., Bruce, R.F., O’Hara, T.P.: Development and use of a gold-standard data set for subjectivity classifications. In: Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics on Computational Linguistics, pp. 246–253 (1999)

  64. BHP Billiton. BHP billiton. Retrieved September, 2014. http://www.bhpbilliton.com/home/Pages/default.aspx. (2014)

  65. Qantas. Qantas. Retrieved September, 2014, from http://www.qantas.com.au/travel/airlines/home/au/en. (2014)

  66. Australian Stock Exchange (ASX) All ordinaries index. Retrieved September, 2014. http://www.asx.com.au/listings/listing-IPO-on-ASX.htm. (2014)

  67. Li, F.: Do Stock Market Investors Understand the Downside Risk Sentiment of Corporate Annual Reports (2007)

  68. Minev, M., Schommer, C., Grammatikos, T.: News and stock markets: A survey on abnormal returns and prediction models (2012)

  69. Nassirtoussi, A.K., Aghabozorgi, S., Wah, T.Y., Ngo, D.C.L.: Text mining for market prediction: A systematic review. Expert Syst. Appl. 41(16), 7653–7670 (2014)

  70. Reuters, T.: Thomson reuters news analytics. Retrieved February, 2015. http://thomsonreuters.com/content/dam/openweb/documents/pdf/tr-com-financial/news-analytics-product-brochure–oct-2010.pdf

  71. Cowan Research LC, U. (2012). Eventus software. Retrieved February, 2015. http://www.eventstudy.com/index.html

  72. Professor Carole Goble School of Computer Science at the University of Manchester, UK. Taverna workflow management system. Retrieved February, 2015. http://www.taverna.org.uk/

Acknowledgments

We are grateful to Dennis Kundisch and Joerg Honnacker for their comments and Sirca [53] for providing access to the data used in this research.

Author information

Corresponding authors

Correspondence to Islam Qudah or Fethi A. Rabhi.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Qudah, I., Rabhi, F.A., Peat, M. (2015). A Proposed Framework for Evaluating the Effectiveness of Financial News Sentiment Scoring Datasets. In: Lugmayr, A. (eds) Enterprise Applications and Services in the Finance Industry. FinanceCom 2014. Lecture Notes in Business Information Processing, vol 217. Springer, Cham. https://doi.org/10.1007/978-3-319-28151-3_3

  • DOI: https://doi.org/10.1007/978-3-319-28151-3_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-28150-6

  • Online ISBN: 978-3-319-28151-3

  • eBook Packages: Computer Science, Computer Science (R0)
