Automatic construction of domain sentiment lexicon for semantic disambiguation

Multimedia Tools and Applications

Abstract

A sentiment lexicon is used to judge the sentiment of words and plays a significant role in sentiment analysis. Existing sentiment lexicons ignore the fact that a word's sentiment can be ambiguous across contexts, and assign each word only a positive or negative polarity. In this paper, we propose an automatic method for constructing a domain-specific sentiment lexicon (SDS-lex) that avoids this sentimental ambiguity. The method incorporates sentiment information not only from existing lexicons but also from the corpus, using our improved TF-IDF algorithm (ITF-IDF). The ITF-IDF algorithm calculates the sentiment of words by considering both the importance of words and the distribution of different parts of speech (POS) in a corpus labeled with different sentiment tendencies. Experiments on real-world datasets show that our constructed lexicon reduces sentimental ambiguity and outperforms many existing lexicons in coverage and accuracy on text sentiment classification tasks.
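To make the idea of a POS-aware, corpus-derived sentiment score concrete, the following is a minimal sketch in Python. It is not the ITF-IDF formula from the paper, which is not given in this preview. It assumes a plain TF-IDF-style weighting in which each (word, POS tag) pair is scored by the difference between its relative frequency in positive and negative documents, scaled by a smoothed inverse document frequency. The function name `pos_aware_sentiment_scores`, the label names, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter


def pos_aware_sentiment_scores(docs):
    """Score each (word, pos_tag) pair from a sentiment-labeled corpus.

    docs: list of (tokens, label), where tokens is a list of
    (word, pos_tag) tuples and label is "pos" or "neg".
    Returns {(word, pos_tag): score}; the sign is the assumed polarity.
    This weighting is an illustrative assumption, not the paper's ITF-IDF.
    """
    class_counts = {"pos": Counter(), "neg": Counter()}
    doc_freq = Counter()
    for tokens, label in docs:
        class_counts[label].update(tokens)   # term counts per sentiment class
        doc_freq.update(set(tokens))         # documents containing each pair

    # Total tokens per class; fall back to 1 to avoid division by zero.
    totals = {label: sum(c.values()) or 1 for label, c in class_counts.items()}
    n_docs = len(docs)

    scores = {}
    for key, df in doc_freq.items():
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0  # smoothed IDF
        tf_pos = class_counts["pos"][key] / totals["pos"]
        tf_neg = class_counts["neg"][key] / totals["neg"]
        scores[key] = (tf_pos - tf_neg) * idf
    return scores


# Toy corpus (hypothetical): "sound" appears as a noun and as an adjective.
docs = [
    ([("sound", "NOUN"), ("great", "ADJ")], "pos"),
    ([("sound", "ADJ"), ("design", "NOUN")], "pos"),
    ([("sound", "NOUN"), ("awful", "ADJ")], "neg"),
]
scores = pos_aware_sentiment_scores(docs)
for key in sorted(scores):
    print(key, round(scores[key], 3))
```

Keying the lexicon on (word, POS tag) rather than on the word alone lets the same surface form carry different scores in different grammatical roles, which is one simple way to reduce the ambiguity the abstract describes.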

Notes

  1. http://ai.stanford.edu/~amaas//data/sentiment/

  2. http://www.cs.cornell.edu/people/pabo/movie-review-data

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61801440), the High-quality and Cutting-edge Disciplines Construction Project for Universities in Beijing (Internet Information, Communication University of China) and the Fundamental Research Funds for the Central Universities.

Author information

Corresponding author

Correspondence to Fulian Yin.

About this article

Cite this article

Wang, Y., Yin, F., Liu, J. et al. Automatic construction of domain sentiment lexicon for semantic disambiguation. Multimed Tools Appl 79, 22355–22373 (2020). https://doi.org/10.1007/s11042-020-09030-1

  • DOI: https://doi.org/10.1007/s11042-020-09030-1
