Abstract
Sentiment analysis is an active research area with applications that include analysing political opinions and classifying comments, movie reviews, news reviews and product reviews. Rule-based sentiment analysis requires a sentiment lexicon. However, constructing a sentiment lexicon manually is time-consuming and costly for resource-limited languages. To reduce development time and cost, we propose an algorithm for constructing Amharic sentiment lexicons. The proposed approach transfers sentiment labels from one language (e.g. English) to a resource-limited language (e.g. Amharic), relying on an Amharic-English dictionary. Using bilingual and monolingual dictionaries as a bridge, two Amharic sentiment lexicons are generated automatically: the first is based on the SO-CAL polarity lexicon, the second on SentiWordNet 3.0. For each Amharic word, the algorithm finds the English word(s) that express its meaning. Sentiment information for these English words is then looked up in the aforementioned sentiment lexicon(s). The weighted average of the returned sentiment values, together with part-of-speech and gloss information, is assigned to the Amharic word. Lexicons of 5683 and 13679 words, respectively, are generated automatically and subsequently evaluated.
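The transfer step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy dictionary entries, the toy polarity scores and the default of equal weights are all assumptions made for illustration, since the abstract does not specify how the weights are chosen.

```python
from typing import Dict, List, Optional

# Toy stand-ins (assumptions): a real run would load an Amharic-English
# dictionary and an English lexicon such as SO-CAL or SentiWordNet 3.0.
# The scores below are invented and are not taken from either lexicon.
BILINGUAL_DICT: Dict[str, List[str]] = {
    "qonJo": ["beautiful", "pretty"],
    "Beet": ["home", "house"],
}
ENGLISH_LEXICON: Dict[str, float] = {
    "beautiful": 4.0,
    "pretty": 2.0,
}


def transfer_sentiment(
    amharic_word: str,
    bilingual: Dict[str, List[str]],
    lexicon: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> Optional[float]:
    """Return the weighted average score of the word's English translations.

    Translations missing from the English lexicon are skipped. If none of
    them carries a score, the word gets no entry (None).
    """
    weights = weights or {}
    scored = [(w, lexicon[w]) for w in bilingual.get(amharic_word, []) if w in lexicon]
    if not scored:
        return None
    # Hypothetical weighting: 1.0 per translation unless a weight is supplied.
    total_weight = sum(weights.get(w, 1.0) for w, _ in scored)
    if total_weight == 0:
        return None
    return sum(weights.get(w, 1.0) * s for w, s in scored) / total_weight


def build_lexicon(
    bilingual: Dict[str, List[str]], lexicon: Dict[str, float]
) -> Dict[str, float]:
    """Build a target-language lexicon from every dictionary headword."""
    result: Dict[str, float] = {}
    for word in bilingual:
        score = transfer_sentiment(word, bilingual, lexicon)
        if score is not None:
            result[word] = score
    return result


amharic_lexicon = build_lexicon(BILINGUAL_DICT, ENGLISH_LEXICON)
```

With the toy data, `qonJo` receives the mean of its two translation scores, while `Beet` gets no entry because neither of its translations appears in the toy English lexicon. Passing a `weights` mapping to `transfer_sentiment` shifts the average towards the translations judged more reliable.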
References
Taboada, M., Brooke, J., Tofiloski, M., Voll, K., Stede, M.: Lexicon-based methods for sentiment analysis. Comput. Linguist. 37(2), 267–307 (2011)
Medagoda, N., Shanmuganathan, S., Whalley, J.: Sentiment lexicon construction using SentiWordNet 3.0. In: Proceedings of the 11th International Conference on Natural Computation (ICNC), pp. 802–807, IEEE (2015)
Baccianella, S., Esuli, A., Sebastiani, F.: SentiWordNet 3.0: an enhanced lexical resource for sentiment analysis and opinion mining. In: Language Resources and Evaluation (LREC), vol. 10, pp. 2200–2204 (2010)
Lexical Data Repository of the Ge’ez Frontier Foundation. https://github.com/geezorg/data. Accessed 15 Feb 2017
Project teams from Addis Ababa University, Masarykova univerzita, Norges teknisk-naturvitenskapelige universitet, The University of Oslo, Hawassa University, 7F14047 HaBiT - Harvesting big text data for under-resourced languages. http://habit-project.eu/wiki/HabitSystemFinal. Accessed 05 May 2018
Gebremeskel, S.: Sentiment mining model for opinionated Amharic texts, Unpublished Masters thesis, Department of Computer Science, Addis Ababa University, Addis Ababa (2010)
Tilahun, T.: Linguistic localization of opinion mining from Amharic blogs. Int. J. Inf. Technol. Comput. Sci. Perspect. 3(1), 890 (2014)
Denecke, K.: Using SentiWordNet for multilingual sentiment analysis. In: Data Engineering Workshop. pp. 507–512, IEEE (2008)
Piao, S., Rayson, P.: Lexical coverage evaluation of large-scale multilingual semantic lexicons for twelve languages. In: European Language Resources Association (ELRA), pp. 2614–2619 (2016)
Potts, C.: Sentiment Symposium Tutorial: Lexicons. Stanford Linguistics. http://sentiment.christopherpotts.net/lexicons.html. Accessed 10 Jan 2019
Appendix A. Sample of Amharic Stemmed Words and Its Variant Word Forms
Three sample words are selected, one each from the verb, noun and adjective categories. Their base forms are /seBeRe/, meaning ‘he broke something’; /Beet/, meaning ‘home’; and /qonJo/, meaning ‘beautiful’, respectively. Table 5 shows the different forms of these words and their corresponding stems or roots.
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Alemneh, G.N., Rauber, A., Atnafu, S. (2019). Dictionary Based Amharic Sentiment Lexicon Generation. In: Mekuria, F., Nigussie, E., Tegegne, T. (eds) Information and Communication Technology for Development for Africa. ICT4DA 2019. Communications in Computer and Information Science, vol 1026. Springer, Cham. https://doi.org/10.1007/978-3-030-26630-1_27
DOI: https://doi.org/10.1007/978-3-030-26630-1_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26629-5
Online ISBN: 978-3-030-26630-1
eBook Packages: Computer Science, Computer Science (R0)