Analyzing market performance via social media: a case study of a banking industry crisis

  • Research Paper
Science China Information Sciences

Abstract

Analyzing market performance via social media has attracted a great deal of attention in the finance and machine learning disciplines. However, most existing research does not consider the strong influence that a crisis has on social media, which in turn affects the relationship between social media and the stock market. This article addresses this gap by proposing a multistage dynamic analysis framework. In this framework, we use an authorship analysis technique and a topic model to identify the stakeholder groups and topics related to a specific firm. We analyze the activities of stakeholder groups and topics in different periods of a crisis to evaluate the crisis’s influence on various social media parameters. We then construct a stock regression model for each stage of the crisis to analyze how changes in stakeholder groups and topics relate to stock behavior. Finally, we discuss the main results, which show that a crisis affects social media discussion topics and that different stakeholder groups and topics have distinct effects on stock market predictions during each stage of a crisis.
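The per-stage regression step can be illustrated with a minimal sketch. It is not the authors' model: it assumes a single predictor (the activity level of one stakeholder group or topic) and an ordinary least-squares fit, whereas the framework described above presumably combines several social media parameters. The stage labels, function names, and data values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def fit_simple_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor has zero variance")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_by_stage(observations):
    """Fit one regression per crisis stage.

    observations: iterable of (stage, activity, stock_return) tuples, where
    activity is a hypothetical social media measure for one group or topic.
    """
    grouped = defaultdict(lambda: ([], []))
    for stage, activity, stock_return in observations:
        xs, ys = grouped[stage]
        xs.append(activity)
        ys.append(stock_return)
    return {stage: fit_simple_ols(xs, ys) for stage, (xs, ys) in grouped.items()}


# Hypothetical toy data: (stage, group activity, daily stock return in percent).
observations = [
    ("pre-crisis", 1.0, 0.5), ("pre-crisis", 2.0, 1.0), ("pre-crisis", 3.0, 1.5),
    ("crisis", 1.0, -1.0), ("crisis", 2.0, -3.0), ("crisis", 3.0, -5.0),
]
coefficients = fit_by_stage(observations)
for stage, (intercept, slope) in coefficients.items():
    print(f"{stage}: intercept={intercept:.2f}, slope={slope:.2f}")
```

Fitting a separate model per stage, rather than one pooled model, lets the estimated slope change sign or magnitude between stages, which is the kind of stage-dependent effect the abstract reports.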

Author information

Corresponding author

Correspondence to Kun Liang.

About this article

Cite this article

Jiang, C., Liang, K., Chen, H. et al. Analyzing market performance via social media: a case study of a banking industry crisis. Sci. China Inf. Sci. 57, 1–18 (2014). https://doi.org/10.1007/s11432-013-4860-3

  • DOI: https://doi.org/10.1007/s11432-013-4860-3
