Financial sentiment analysis model utilizing knowledge-base and domain-specific representation

  • 1209: Recent Advances on Social Media Analytics and Multimedia Systems: Issues and Challenges
Multimedia Tools and Applications

Abstract

Financial sentiment analysis is challenging because the market is influenced by many factors, such as company-specific and political news, the sentiment and opinions of users, and other regional financial markets. Good news can drive the market upwards, while negative news can drag it downwards. It is therefore crucial to understand the impact of news and social media on stock market trends. Motivated by this, this paper develops an effective and efficient company-specific financial sentiment analysis model that can detect trends in a company’s stock price. More specifically, we develop a novel neural network model that transforms pretrained general word embeddings into domain-specific embeddings. In addition, we use a knowledge-base to enrich the training vocabulary and thus extend the domain-specific embedding space. A main challenge for natural language processing (NLP) applications is learning representations for rare and unseen words. Another challenge for financial sentiment analysis models, also addressed in this paper, is handling words whose polarity changes with the domain in which they are used. We thoroughly evaluate the proposed model on the benchmark dataset of the SemEval-2017 shared task on financial sentiment analysis. The experimental results show that the proposed model delivers state-of-the-art performance on the Twitter and news headlines datasets, demonstrating its feasibility and effectiveness.
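
The idea of mapping general word embeddings into a domain-specific space, and then using a knowledge-base to reach words missing from the training data, can be illustrated with a toy sketch. This is not the model proposed in the paper: a linear least-squares map is swapped in for the neural network, the vectors are random rather than pretrained, and all names (`fit_linear_map`, `extend_with_knowledge_base`, `knowledge_base`, `true_map`) are hypothetical and introduced only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for pretrained general-purpose vectors; random, not real embeddings.
words = ["profit", "loss", "gain", "bull", "bear", "surge", "rally", "debt",
         "plunge", "drop"]
DIM_GENERAL, DIM_DOMAIN = 4, 3
general = {w: rng.normal(size=DIM_GENERAL) for w in words}

# Assumption: a hidden linear relation, used only to fabricate toy domain vectors.
true_map = rng.normal(size=(DIM_GENERAL, DIM_DOMAIN))
seen = words[:8]  # words that occur in the (toy) financial training data
domain = {w: general[w] @ true_map for w in seen}

# Hypothetical knowledge-base: seen word -> related words absent from training data.
knowledge_base = {"loss": ["plunge", "drop"]}


def fit_linear_map(general, domain, train_words):
    """Least-squares W such that general[w] @ W approximates domain[w]."""
    X = np.stack([general[w] for w in train_words])
    Y = np.stack([domain[w] for w in train_words])
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W


def extend_with_knowledge_base(general, domain, knowledge_base, W):
    """Project knowledge-base words that lack a domain vector into domain space."""
    extended = dict(domain)
    for related in knowledge_base.values():
        for w in related:
            if w not in extended and w in general:
                extended[w] = general[w] @ W
    return extended


W = fit_linear_map(general, domain, seen)
extended = extend_with_knowledge_base(general, domain, knowledge_base, W)
print(sorted(set(extended) - set(domain)))  # ['drop', 'plunge']
```

In this sketch the unseen words "plunge" and "drop" receive domain-specific vectors only because the knowledge-base links them to a seen word and a general vector exists for them; a learned nonlinear mapping would take the place of `W` in a neural variant.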

Author information

Corresponding author

Correspondence to Basant Agarwal.

Cite this article

Agarwal, B. Financial sentiment analysis model utilizing knowledge-base and domain-specific representation. Multimed Tools Appl 82, 8899–8920 (2023). https://doi.org/10.1007/s11042-022-12181-y

  • DOI: https://doi.org/10.1007/s11042-022-12181-y
