Abstract
A sentiment lexicon is used to judge the sentiment of words and plays a significant role in sentiment analysis. Existing sentiment lexicons ignore the sentimental ambiguity of words across contexts and assign each word only a positive or negative polarity. In this paper, we propose an automatic method for constructing a domain-specific sentiment lexicon (SDS-lex) that avoids sentimental ambiguity. The method incorporates sentiment information not only from existing lexicons but also from the corpus, using our improved TF-IDF algorithm (ITF-IDF). The ITF-IDF algorithm calculates the sentiment of a word by considering both the importance of the word and the distribution of different parts of speech (POS) in a corpus labeled with different sentiment tendencies. Experiments on real-world datasets show that the constructed lexicon reduces sentimental ambiguity and outperforms many existing lexicons in coverage and accuracy on text sentiment classification tasks.
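The abstract does not give the ITF-IDF formula, so the following is only a rough sketch of the general idea it describes: scoring each (word, POS) pair by how its TF-IDF-style weight differs between positively and negatively labeled documents. The function name `polarity_scores`, the smoothed IDF, the class-normalized term frequency, and the toy corpus are all assumptions made for illustration and should not be read as the authors' algorithm.

```python
import math
from collections import Counter


def polarity_scores(docs):
    """Illustrative only (NOT the paper's ITF-IDF formula).

    docs: list of (label, tokens), where label is "pos" or "neg" and
    tokens is a list of (word, pos_tag) pairs.
    Returns {(word, pos_tag): score}; score > 0 leans positive,
    score < 0 leans negative.
    """
    n_docs = len(docs)
    tf = {"pos": Counter(), "neg": Counter()}  # term counts per sentiment class
    df = Counter()                             # document frequency over the corpus
    for label, tokens in docs:
        tf[label].update(tokens)
        df.update(set(tokens))

    # Total token count per class, used to normalize term frequencies.
    totals = {label: sum(counts.values()) or 1 for label, counts in tf.items()}

    scores = {}
    for term in df:
        # Smoothed IDF (an assumed choice) so that rarer terms weigh more.
        idf = math.log((1 + n_docs) / (1 + df[term])) + 1
        rel_pos = tf["pos"][term] / totals["pos"]
        rel_neg = tf["neg"][term] / totals["neg"]
        scores[term] = idf * (rel_pos - rel_neg)
    return scores


# Hypothetical toy corpus: "like" as a verb vs. as a preposition.
docs = [
    ("pos", [("i", "PRON"), ("like", "VERB"), ("it", "PRON")]),
    ("neg", [("feels", "VERB"), ("like", "ADP"), ("plastic", "NOUN")]),
    ("pos", [("like", "VERB"), ("screen", "NOUN")]),
    ("neg", [("looks", "VERB"), ("like", "ADP"), ("junk", "NOUN")]),
]
scores = polarity_scores(docs)
```

Keying the scores on (word, POS) pairs rather than on bare words lets the same surface form carry different polarities, which is one simple way to reflect the POS-dependent ambiguity the abstract mentions. In this toy corpus, `("like", "VERB")` receives a positive score and `("like", "ADP")` a negative one.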
Acknowledgements
This work was supported by the National Natural Science Foundation of China (No. 61801440), the High-quality and Cutting-edge Disciplines Construction Project for Universities in Beijing (Internet Information, Communication University of China) and the Fundamental Research Funds for the Central Universities.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Wang, Y., Yin, F., Liu, J. et al. Automatic construction of domain sentiment lexicon for semantic disambiguation. Multimed Tools Appl 79, 22355–22373 (2020). https://doi.org/10.1007/s11042-020-09030-1
DOI: https://doi.org/10.1007/s11042-020-09030-1