
Sentiment analysis of Chinese stock reviews based on BERT model

Published in: Applied Intelligence

Abstract

A large number of stock reviews are available on the Internet, and sentiment analysis of these reviews is of strong significance for research on the financial market. Due to the lack of a large amount of labeled data, it is difficult to improve the accuracy of Chinese stock-review sentiment classification with traditional methods. To address this challenge, this paper proposes a novel sentiment analysis model for Chinese stock reviews based on BERT. The model relies on a pre-trained language model to improve classification accuracy: BERT produces a sentence-level representation of each stock review, and the resulting feature vector is then fed into a classifier layer. In our experiments, the proposed method achieves higher precision, recall, and F1 than TextCNN, TextRNN, Att-BLSTM, and TextCRNN, obtaining the best results and demonstrating its effectiveness for Chinese stock review sentiment analysis. Moreover, our model has strong generalization capacity and can perform sentiment analysis in many other fields.
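The pipeline the abstract describes (pre-trained BERT encoder → sentence-level feature vector → classifier layer) can be sketched as follows. This is a minimal illustration, not the authors' released code: the 768-dimensional pooled vector, the weight shapes, and the random stand-in input are all assumptions, and the encoder itself is replaced by a random vector.

```python
import math
import random

# Sketch of the classifier head described in the abstract: a pre-trained
# BERT encoder yields a sentence-level feature vector, which a dense
# softmax layer maps to positive/negative sentiment. The encoder is not
# reproduced here; a random vector stands in for its pooled output.

HIDDEN = 768    # hidden size of BERT-base (assumed)
CLASSES = 2     # positive / negative stock-review sentiment

random.seed(0)

# Stand-in for the sentence-level representation BERT would produce.
cls_vector = [random.gauss(0.0, 1.0) for _ in range(HIDDEN)]

# Randomly initialized classifier-layer parameters (in practice these
# would be learned during fine-tuning).
W = [[random.gauss(0.0, 0.02) for _ in range(CLASSES)] for _ in range(HIDDEN)]
b = [0.0] * CLASSES

# Dense layer: logits[j] = sum_i cls_vector[i] * W[i][j] + b[j]
logits = [sum(cls_vector[i] * W[i][j] for i in range(HIDDEN)) + b[j]
          for j in range(CLASSES)]

# Numerically stable softmax over the two classes.
peak = max(logits)
exps = [math.exp(z - peak) for z in logits]
probs = [e / sum(exps) for e in exps]

label = ["negative", "positive"][probs.index(max(probs))]
print(probs, label)
```

In the paper's actual setup the feature vector comes from a fine-tuned Chinese BERT model rather than a random stand-in; only the shape of the classifier head is intended to match the description.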


Availability of data and materials

The experiment data are available at https://github.com/algosenses/Stock_Market_Sentiment_Analysis/tree/master/data

Abbreviations

BERT:

Bidirectional Encoder Representations from Transformers

SVM:

Support Vector Machine

CNN:

Convolutional Neural Network

RNN:

Recurrent Neural Network

LSTM:

Long Short-Term Memory

References

  1. Sheu H-J, Lu Y-C, Wei Y-C (2010) Causalities between sentiment indicators and stock market returns under different market scenarios. Int. J Bus Fin Res 4(1):159–171


  2. Wawre SV, Deshmukh SN (2016) Sentiment classification using machine learning techniques. Int J Sci Res (IJSR) 5(4):819–821


  3. Feng S, Fu Y, Yang F, Wang D, Zhang D (2012) Blog sentiment orientation analysis based on dependency parsing. J Comput Res Dev 49(11):2395–2406


  4. Pang B, Lee L, Vaithyanathan S (2002) Thumbs up? Sentiment classification using machine learning techniques. In: Proceedings of EMNLP, pp 79–86

  5. Devlin J, Chang M-W, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL-HLT, pp 4171–4186

  6. Kim Y (2014) Convolutional neural networks for sentence classification. In: Proceedings of EMNLP

  7. Liu P, Qiu X, Huang X (2016) Recurrent neural network for text classification with multi-task learning. In: Proceedings of IJCAI, pp 2873–2879

  8. Zhou P, Shi W, Tian J, Qi Z, Xu B (2016) Attention-based bidirectional long short-term memory networks for relation classification. In: Proceedings of ACL, p 207

  9. Wang R, Li Z, Cao J, Chen T, Wang L (2019) Convolutional recurrent neural networks for text classification. In: Proceedings of IJCNN, pp 1–6

  10. Abualigah L, Alfar HE, Shehab M, Hussein AMA (2020) Sentiment analysis in healthcare: a brief review. In: Abd Elaziz M, Al-qaness M, Ewees A, Dahou A (eds) Recent advances in NLP: the case of Arabic language. Studies in computational intelligence, vol 874. Springer, Cham. https://doi.org/10.1007/978-3-030-34614-0_7


  11. Zubair AM, Aurangzeb K, Shakeel A et al (2017) Lexicon-enhanced sentiment analysis framework using rule-based classification scheme. PLoS One 12(2):e0171649

  12. Liu Y, Bi JW, Fan ZP (2017) A method for multi-class sentiment classification based on an improved one-vs-one (OVO) strategy and the support vector machine (SVM) algorithm. Inf Sci 394:38–52

  13. Zhang L, Wang S, Liu B (2018) Deep learning for sentiment analysis: a survey. Wiley Interdiscip Rev Data Min Knowl Discov 8(4):e1253

  14. Wang Y, Wang M, Fujita H (2020) Word sense disambiguation: a comprehensive knowledge exploitation framework. Knowl-Based Syst 190:105030

  15. Pranckevičius T, Marcinkevičius V (2017) Comparison of naive bayes, random forest, decision tree, support vector machines, and logistic regression classifiers for text reviews classification. Baltic J Mod Comput 5(2):221


  16. Zhang D, Xu H, Su Z, Xu Y (2015) Chinese comments sentiment classification based on word2vec and SVMperf. Expert Syst Appl 42(4):1857–1863

  17. Jeevanandam J, Koteeswaran S (2015) Decision tree based feature selection and multilayer perceptron for sentiment analysis. ARPN J Eng Appl Sci 10(14):5883–5894


  18. Abid F, Alam M, Yasir M, Li C (2019) Sentiment analysis through recurrent variants latterly on convolutional neural network of Twitter. Futur Gener Comput Syst 95:292–308. https://doi.org/10.1016/j.future.2018.12.018


  19. Santos D, Gatti M (2014) Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts, in Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics. Technical Papers, Dublin, Ireland, pp 69–78 [Online]. Available: https://www.aclweb.org/anthology/C14-1008


  20. Wang X, Liu Y, Sun C, Wang B, Wang X (2015) Predicting Polarities of Tweets by Composing Word Embeddings with Long Short-Term Memory, in Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Beijing, China, pp. 1343–1353, doi: https://doi.org/10.3115/v1/P15-1130

  21. Wu X, Chen H, Wang J, Troiano L, Loia V, Fujita H (2020) Adaptive stock trading strategies with deep reinforcement learning methods. Inf Sci 538:142–158


  22. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need. Adv Neural Inf Process Syst 30:5998–6008

  23. Hu X, Bing L, Lei S, Philip YS (2019) BERT post-training for review reading comprehension and aspect-based sentiment analysis. In: Proceedings of NAACL, pp 2324–2335

  24. Catelli R, Gargiulo F, Casola V, De Pietro G, Fujita H, Esposito M (2020) Crosslingual named entity recognition for clinical de-identification applied to a COVID-19 Italian data set. Appl Soft Comput 97:106779

  25. Yu J, Wei Y, Zhang Y (2019) Automatic ancient Chinese texts segmentation based on BERT. J Chin Inf Process 33(11):57–63

  26. Duan D, Tang J, Wen Y, Yuan K (2019) BERT based research on classification of short Chinese text. Comput Eng. https://doi.org/10.19678/j.issn.1000-3428.0056222

  27. Sun C, Qiu X, Xu Y, Huang X (2019) How to fine-tune BERT for text classification? In: China National Conference on Chinese Computational Linguistics. Springer, Cham

  28. Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780

  29. Li YX, Tan CL, Ding X, Liu C (2004) Contextual post-processing based on the confusion matrix in offline handwritten Chinese script recognition. Pattern Recogn 37(9):1901–1912

  30. Abualigah LM, Khader AT, Hanandeh ES (2018) Hybrid clustering analysis using improved krill herd algorithm. Appl Intell 48(11):4047–4071

  31. Meng X (2019) Feature selection and enhanced krill herd algorithm for text document clustering. Comput Rev 60(8):318–318

  32. Abualigah LM, Khader AT, Hanandeh ES (2017) A new feature selection method to improve the document clustering using particle swarm optimization algorithm. J Comput Sci

  33. Esposito M, Damiano E, Minutolo A, De Pietro G, Fujita H (2020) Hybrid query expansion using lexical resources and word embeddings for sentence retrieval in question answering. Inf Sci 514:88–105

  34. Rezaeinia SM, Rahmani R, Ghodsi A, Veisi H (2019) Sentiment analysis based on improved pre-trained word embeddings. Expert Sys Appl 117:139–147. https://doi.org/10.1016/j.eswa.2018.08.044


  35. Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J (2013) Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp 3111–3119


Acknowledgements

This work was supported by the Industry-University Cooperation Cooperative Education Program of the Department of Higher Education, Ministry of Education of China, under Grant No. 201901148034.

Funding

Industry-University Cooperation Cooperative Education Program of the Department of Higher Education, Ministry of Education of China, Grant No. 201901148034.

Author information

Authors and Affiliations

Authors

Contributions

Mingzheng Li: Conceptualization, Methodology, Investigation, Writing - review & editing, Supervision. Lei Chen: Resources, Writing - review & editing, Supervision. Jing Zhao: Data curation, Writing - review & editing. Qiang Li: Software, Visualization, Writing - review & editing.

Corresponding author

Correspondence to Mingzheng Li.

Ethics declarations

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Li, M., Chen, L., Zhao, J. et al. Sentiment analysis of Chinese stock reviews based on BERT model. Appl Intell 51, 5016–5024 (2021). https://doi.org/10.1007/s10489-020-02101-8

