Abstract
Constructing domain-specific sentiment lexicons has become an important way to improve the performance of sentiment analysis in recent years. The stock market is one of the major application areas of sentiment analysis, and related research exists for it. However, although these studies consider the heterogeneity of the stock market relative to other fields, they ignore its heterogeneity under different market conditions. These studies also depend on an annotated corpus, but such corpora are very difficult to obtain, especially for social media text, which is non-standardized, domain-specific, and large in volume; both manual and automatic labeling have limitations. In addition, stock market sentiment lexicons are still evaluated with general classification metrics, which ignores the ultimate purpose of sentiment analysis in the stock market: helping market participants make investment decisions, that is, achieve the highest profit. To address these problems, this paper proposes a new unsupervised method for constructing a stock market sentiment lexicon based on the heterogeneity of the stock market, together with a method for evaluating stock market sentiment lexicons. We select four commonly used Chinese sentiment dictionaries as benchmark lexicons and verify the method on an unlabeled Eastmoney stock posting corpus containing 15,733,552 posts about 2,400 Chinese A-share listed companies. Under our lexicon evaluation framework, which is based on portfolio annualized return, the stock market sentiment lexicon constructed in this paper achieves the best performance.
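As an illustration of what an evaluation based on portfolio annualized return could look like, the following is a minimal sketch and not the procedure used in the paper. It assumes each lexicon has already produced a series of simple daily portfolio returns, that a year has 252 trading days, and that returns are compounded geometrically. The function names `annualized_return` and `rank_lexicons`, the lexicon names, and the numbers are all hypothetical.

```python
from math import prod

# Assumption: 252 trading days per year is a common convention, not taken from the paper.
TRADING_DAYS_PER_YEAR = 252


def annualized_return(daily_returns, periods_per_year=TRADING_DAYS_PER_YEAR):
    """Compound simple daily returns and scale the result to a one-year horizon."""
    if not daily_returns:
        raise ValueError("need at least one daily return")
    growth = prod(1.0 + r for r in daily_returns)
    return growth ** (periods_per_year / len(daily_returns)) - 1.0


def rank_lexicons(returns_by_lexicon):
    """Sort lexicon names by the annualized return of their portfolios, best first."""
    scores = {name: annualized_return(r) for name, r in returns_by_lexicon.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical daily portfolio returns, one series per lexicon (made-up numbers).
example = {
    "lexicon_a": [0.01, -0.005, 0.002, 0.004],
    "lexicon_b": [0.0, 0.001, -0.002, 0.001],
}
ranking = rank_lexicons(example)
```

Under this sketch, the lexicon whose sentiment signals drive the portfolio with the highest annualized return ranks first, which is a profit-oriented criterion rather than a classification metric such as accuracy.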
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Liu, Y., Alsaadi, F.E. (2020). A Novel Way to Build Stock Market Sentiment Lexicon. In: He, J., et al. Data Science. ICDS 2019. Communications in Computer and Information Science, vol 1179. Springer, Singapore. https://doi.org/10.1007/978-981-15-2810-1_34
DOI: https://doi.org/10.1007/978-981-15-2810-1_34
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-2809-5
Online ISBN: 978-981-15-2810-1
eBook Packages: Computer Science (R0)