Abstract
The aim of this work is to reproduce the approach to detecting semantic orientations in economic texts presented in the paper "Good Debt or Bad Debt: Detecting Semantic Orientations in Economic Texts" by Malo et al. The approach employs the Linearized Phrase Structure model for sentence-level classification of short economic texts into positive, negative, or neutral categories from an investor's perspective and yields state-of-the-art results. The method combines rule-based linguistic models with machine learning. Where possible, we follow the approach described in the original paper, with some documented modifications. Our solution is simplified in at least two aspects, but its performance is comparable to the original and overall remains better than the reported results of the other benchmark algorithms mentioned in the original paper. The differences between the two models and their results are described in detail and lead to the conclusion that the original approach is to a large extent repeatable and that our simplified version does not overly sacrifice performance for generalizability.
Notes
1. The financial phrase bank was annotated by 16 annotators and is split into 4 subsets depending on the level of annotator agreement (100%, >75%, >66%, >50%).
2. Available at https://www3.nd.edu/~mcdonald/Word_Lists.html.
3.
4. Our list is online: http://kt.ijs.si/data/finentities/financial_entities.zip.
5. We tested the effect of a smaller set of entities by experimenting with a randomly halved set, which on average (over 10 runs) caused the accuracy to drop by 6.2%.
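The agreement-based split described in note 1 can be illustrated with a minimal sketch. It is not the procedure used to build the phrase bank: it assumes that agreement is the share of annotators who chose the majority label and that the four subsets are cumulative (a sentence with full agreement also belongs to the looser subsets). The function names, the toy sentences, and their labels are hypothetical.

```python
from collections import Counter


def agreement(labels):
    """Return (majority_label, share of annotators who chose it)."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


def agreement_subsets(annotated):
    """Group (sentence, labels) pairs by majority-vote agreement level.

    Assumption: the thresholds are cumulative, so one sentence can
    appear in several subsets.
    """
    subsets = {"100%": [], ">75%": [], ">66%": [], ">50%": []}
    for sentence, labels in annotated:
        label, share = agreement(labels)
        if share == 1.0:
            subsets["100%"].append((sentence, label))
        if share > 0.75:
            subsets[">75%"].append((sentence, label))
        if share > 0.66:
            subsets[">66%"].append((sentence, label))
        if share > 0.50:
            subsets[">50%"].append((sentence, label))
    return subsets


# Hypothetical toy annotations (not taken from the phrase bank).
toy = [
    ("Operating profit rose.", ["positive"] * 5),
    ("Sales were flat.", ["neutral"] * 4 + ["negative"]),
    ("Net loss widened.", ["negative"] * 3 + ["neutral"]),
    ("Order intake changed.", ["positive"] * 3 + ["neutral", "negative"]),
    ("Guidance was updated.", ["neutral"] * 2 + ["positive"] * 2),
]
subsets = agreement_subsets(toy)
for name, items in subsets.items():
    print(name, len(items))
```

Under these assumptions the last toy sentence has no strict majority and falls into none of the subsets.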
References
Oh, C., Sheng, O.: Investigating predictive power of stock micro blog sentiment in forecasting future stock price directional movement. In: ICIS (2011)
Bollen, J., Mao, H., Pepe, A.: Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena. In: ICWSM 2011, pp. 450–453 (2011)
Smailović, J., Grčar, M., Lavrač, N., Žnidaršič, M.: Stream-based active learning for sentiment analysis in the financial domain. Inf. Sci. 285, 181–203 (2014)
Cortis, K., Freitas, A., Daudert, T., Huerlimann, M., Zarrouk, M., Handschuh, S., Davis, B.: SemEval-2017 Task 5: Fine-grained sentiment analysis on financial microblogs and news. In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pp. 519–535 (2017)
Mitra, G., Mitra, L.: The Handbook of News Analytics in Finance, vol. 596. Wiley, Hoboken (2011)
Loughran, T., McDonald, B.: When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks. J. Finan. 66(1), 35–65 (2011)
Malo, P., Sinha, A., Korhonen, P., Wallenius, J., Takala, P.: Good debt or bad debt: Detecting semantic orientations in economic texts. J. Assoc. Inf. Sci. Technol. 65(4), 782–796 (2014)
Moilanen, K., Pulman, S.: Sentiment composition. In: Proceedings of RANLP, vol. 7, pp. 378–382 (2007)
Wiebe, J., Wilson, T., Cardie, C.: Annotating expressions of opinions and emotions in language. Lang. Resour. Eval. 39(2), 165–210 (2005)
Wilson, T.A.: Fine-grained subjectivity and sentiment analysis: Recognizing the intensity, polarity, and attitudes of private states. University of Pittsburgh (2008)
Wilson, T., Wiebe, J., Hoffmann, P.: Recognizing contextual polarity: an exploration of features for phrase-level sentiment analysis. Comput. Linguist. 35(3), 399–433 (2009)
Bodnaruk, A., Loughran, T., McDonald, B.: Using 10-K text to gauge financial constraints. J. Financ. Quant. Anal. 50(4), 623–646 (2015)
Acknowledgments
We acknowledge financial support from the Slovenian Research Agency for research core funding No. P2-0103 and the research project Influence of formal and informal corporate communications on capital markets, No. J5-7387.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Štihec, J., Žnidaršič, M., Pollak, S. (2018). Simplified Hybrid Approach for Detection of Semantic Orientations in Economic Texts. In: Pasi, G., Piwowarski, B., Azzopardi, L., Hanbury, A. (eds) Advances in Information Retrieval. ECIR 2018. Lecture Notes in Computer Science(), vol 10772. Springer, Cham. https://doi.org/10.1007/978-3-319-76941-7_64
DOI: https://doi.org/10.1007/978-3-319-76941-7_64
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-76940-0
Online ISBN: 978-3-319-76941-7
eBook Packages: Computer Science, Computer Science (R0)