Leveraging statistical information in fine-grained financial sentiment analysis

Zhang, Han; Li, Zongxi; Xie, Haoran; Lau, Raymond Y. K.; Cheng, Gary; Li, Qing; Zhang, Dian

doi:10.1007/s11280-021-00993-1

Published: 05 February 2022

World Wide Web, Volume 25, pages 513–531 (2022)

Han Zhang¹,
Zongxi Li ORCID: orcid.org/0000-0002-1708-7099²,
Haoran Xie³,
Raymond Y. K. Lau⁴,
Gary Cheng⁵,
Qing Li⁶ &
Dian Zhang⁷

Abstract

The recent development of deep learning-based natural language processing (NLP) methods has fostered many downstream applications in various fields. One such application in the financial industry is fine-grained financial sentiment analysis (FSA), which aims to determine the sentiment orientation of financial texts, i.e., bullish or bearish, by predicting a polarity score, and which has been widely applied to stock-related opinion mining. FSA is challenging because large-scale labeled datasets are lacking and the task is domain-dependent. Previous works mainly focus on constructing and exploiting handcrafted lexicons that encode expert knowledge to enhance the semantic features used in decision making; such lexicons yield improvements but are expensive to acquire. This paper proposes a lightweight regression model that incorporates the statistical distribution of a term over the polarity range, say between −1 and 1, to address the fine-grained FSA task. More concretely, we first count each word’s appearances in different polarity intervals and produce a statistic-based representation for each text, which is encoded as a corpus-level statistical feature vector by an autoencoder. Subsequently, the obtained feature vector is integrated with the semantic feature vector in the regression model. Our experiments show that such a model produces significant improvements over the baseline models on two FSA subsets, i.e., news headlines and microblogs, without computational overhead. Furthermore, we observe that the signs neglected by lexicon-based approaches can play an important role in FSA.
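The counting step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the equal-width intervals, the whitespace tokenisation, the averaging of per-word distributions into a text-level vector, and all function names (`bin_index`, `word_interval_counts`, `text_representation`) are assumptions made for illustration. The autoencoder and the regression model are omitted.

```python
from collections import defaultdict


def bin_index(score, n_bins, lo=-1.0, hi=1.0):
    """Map a polarity score in [lo, hi] to one of n_bins equal-width intervals.

    Equal-width intervals are an assumption of this sketch.
    """
    if not lo <= score <= hi:
        raise ValueError(f"score {score} is outside [{lo}, {hi}]")
    width = (hi - lo) / n_bins
    # The upper bound hi falls into the last interval.
    return min(int((score - lo) / width), n_bins - 1)


def word_interval_counts(corpus, n_bins):
    """Count how often each word appears in texts of each polarity interval.

    corpus is an iterable of (text, polarity_score) pairs.
    Whitespace tokenisation and lowercasing are simplifying assumptions.
    """
    counts = defaultdict(lambda: [0] * n_bins)
    for text, score in corpus:
        b = bin_index(score, n_bins)
        for token in text.lower().split():
            counts[token][b] += 1
    return dict(counts)


def text_representation(text, counts, n_bins):
    """Build a statistic-based vector for one text.

    Each known word contributes its normalised distribution over the
    intervals; the text vector is their average (an assumed aggregation).
    Unknown words are skipped; a text with no known word maps to zeros.
    """
    total = [0.0] * n_bins
    known = 0
    for token in text.lower().split():
        word_counts = counts.get(token)
        if word_counts is None:
            continue
        n = sum(word_counts)
        for i in range(n_bins):
            total[i] += word_counts[i] / n
        known += 1
    if known == 0:
        return total
    return [value / known for value in total]


if __name__ == "__main__":
    # Toy labeled corpus, invented for illustration only.
    toy_corpus = [
        ("shares surge on profit", 0.8),
        ("shares plunge on loss", -0.7),
        ("profit rises", 0.5),
    ]
    toy_counts = word_interval_counts(toy_corpus, n_bins=4)
    print(text_representation("profit surge", toy_counts, n_bins=4))
```

In the pipeline the abstract describes, a vector of this kind would then be compressed by an autoencoder and combined with a semantic feature vector before regression; those stages are outside the scope of this sketch.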

Acknowledgements

The work of this paper has been supported by the Hong Kong Research Grants Council under the General Research Fund (Project No. 15200021), the Lam Woo Research Fund (Project No. LWI20011) and the Faculty Research Grant (Project No. DB21B6) of Lingnan University, Hong Kong, the One-off Special Fund from Central and Faculty Fund in Support of Research from 2019/20 to 2021/22 (Project No. MIT02/19-20) and the Research Cluster Fund (Project No. RG 78/2019-2020R) of The Education University of Hong Kong. Lau’s work was supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. CityU 11507219). Dian Zhang’s work was supported by NSFC (Project No. 61872247).

Author information

Authors and Affiliations

Department of Finance, The Chinese University of Hong Kong, Shatin, Hong Kong
Han Zhang
School of Science and Technology, Hong Kong Metropolitan University, Ho Man Tin, Hong Kong
Zongxi Li
Department of Computing and Decision Sciences, Lingnan University, Tuen Mun, Hong Kong
Haoran Xie
Department of Information Systems, City University of Hong Kong, Kowloon, Hong Kong
Raymond Y. K. Lau
Department of Mathematics and Information Technology, The Education University of Hong Kong, Tai Po, Hong Kong
Gary Cheng
Department of Computing, The Hong Kong Polytechnic University, Hung Hom, Hong Kong
Qing Li
College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
Dian Zhang

Corresponding author

Correspondence to Zongxi Li.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special Issue on Web Intelligence = Artificial Intelligence in the Connected World. Guest Editors: Yuefeng Li, Amit Sheth, Athena Vakali, and Xiaohui Tao

About this article

Cite this article

Zhang, H., Li, Z., Xie, H. et al. Leveraging statistical information in fine-grained financial sentiment analysis. World Wide Web 25, 513–531 (2022). https://doi.org/10.1007/s11280-021-00993-1

Received: 21 July 2021
Revised: 29 September 2021
Accepted: 21 December 2021
Published: 05 February 2022
Issue Date: March 2022
DOI: https://doi.org/10.1007/s11280-021-00993-1
