
Cross-domain sentiment aware word embeddings for review sentiment analysis

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Learning low-dimensional vector representations of words from a large corpus is one of the basic tasks in natural language processing (NLP). Existing general-purpose word embedding models learn word vectors mainly from the syntactic and semantic information in the context and ignore the sentiment information that words carry. Some approaches do model the sentiment information in reviews, but they do not account for words whose sentiment changes from one domain to another. If general word vectors are applied directly to review sentiment analysis in such cases, sentiment classification performance inevitably suffers. To solve this problem, this paper extends the CBoW (continuous bag-of-words) word vector model and proposes a cross-domain sentiment aware word embedding learning model that captures the sentiment information and the domain relevance of a word at the same time. Several experiments on Amazon user review data from different domains evaluate the model. The results show that, when only the sentiment information of the context is modeled, the proposed model improves accuracy by nearly 2% over general word vectors. When both domain information and sentiment information are included, the accuracy and Macro-F1 of the sentiment classification tasks improve significantly over existing sentiment word embeddings.
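The abstract does not give the model's objective function, so the following is only a generic sketch of how a CBoW-style model might be extended with a sentiment term. It is not the authors' formulation. The function name `joint_cbow_loss`, the mixing weight `alpha`, and the logistic sentiment head `sent_weights` are all assumptions made for illustration, and the domain-relevance component described in the abstract is left out entirely.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def joint_cbow_loss(context_ids, target_id, sentiment_label,
                    emb_in, emb_out, sent_weights, alpha=0.5):
    """Loss for one training window (hypothetical formulation).

    context_ids: vocabulary indices of the context words
    target_id: index of the centre word to predict
    sentiment_label: 1 for a positive review, 0 for a negative one
    alpha: assumed weight trading off the two loss terms
    Returns (total_loss, context_loss, sentiment_loss).
    """
    dim = len(emb_in[0])
    # CBoW hidden layer: the average of the context word vectors.
    hidden = [sum(emb_in[i][d] for i in context_ids) / len(context_ids)
              for d in range(dim)]
    # Term 1: standard CBoW cross-entropy for the centre word.
    scores = [sum(h * w for h, w in zip(hidden, row)) for row in emb_out]
    context_loss = -math.log(softmax(scores)[target_id])
    # Term 2: logistic loss for the sentiment label of the review.
    p_pos = sigmoid(sum(h * w for h, w in zip(hidden, sent_weights)))
    p_label = p_pos if sentiment_label == 1 else 1.0 - p_pos
    sentiment_loss = -math.log(p_label)
    total = alpha * context_loss + (1.0 - alpha) * sentiment_loss
    return total, context_loss, sentiment_loss


# Toy usage: a 6-word vocabulary with 4-dimensional vectors.
random.seed(0)
vocab_size, dim = 6, 4
emb_in = [[random.uniform(-0.5, 0.5) for _ in range(dim)]
          for _ in range(vocab_size)]
emb_out = [[random.uniform(-0.5, 0.5) for _ in range(dim)]
           for _ in range(vocab_size)]
sent_weights = [random.uniform(-0.5, 0.5) for _ in range(dim)]
total, ctx, sent = joint_cbow_loss([0, 1, 3, 4], 2, 1,
                                   emb_in, emb_out, sent_weights)
```

Because both terms are computed from the same averaged context vector, gradient updates to the input embeddings would be pulled by context prediction and by the sentiment label together. In this sketch, that is the sense in which a word vector could be called sentiment aware.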

Acknowledgements

This work was supported by the Chongqing Research Program of Basic Research and Frontier Technology (Grant No. cstc2017jcyjAX0270) and the National Natural Science Foundation of China (Grant Nos. 61772099 and 61872086).

Author information

Corresponding author

Correspondence to Jun Liu.

About this article

Cite this article

Liu, J., Zheng, S., Xu, G. et al. Cross-domain sentiment aware word embeddings for review sentiment analysis. Int. J. Mach. Learn. & Cyber. 12, 343–354 (2021). https://doi.org/10.1007/s13042-020-01175-7

  • DOI: https://doi.org/10.1007/s13042-020-01175-7
