A New Method for Sentiment Analysis Using Contextual Auto-Encoders

  • Regular Paper
Journal of Computer Science and Technology

Abstract

Sentiment analysis, a hot research topic, presents new challenges for understanding users’ opinions and judgments expressed online. It aims to classify subjective texts by assigning them a polarity label. In this paper, we introduce a novel machine learning framework that uses an auto-encoder network to predict the sentiment polarity label at the word level and the sentence level. Inspired by the dimensionality reduction and feature extraction capabilities of auto-encoders, we propose a new model for distributed word vector representation, “PMI-SA”, which takes pointwise mutual information (“PMI”) word vectors as input. The resulting continuous word vectors are combined to represent a sentence. An unsupervised sentence embedding method, called Contextual Recursive Auto-Encoders (“CoRAE”), is also developed for learning sentence representations. CoRAE follows the basic idea of recursive auto-encoders, deeply composing the vectors of the words that constitute the sentence, but without relying on any syntactic parse tree. The CoRAE model recursively combines each word with its context words (its neighbors: the previous and next words) while taking word order into account. A support vector machine classifier with a fine-tuning technique is also used to show that our deep compositional representation model CoRAE significantly improves the accuracy of the sentiment analysis task. Experimental results demonstrate that CoRAE remarkably outperforms several competitive baseline methods on two databases, namely, the Sanders Twitter corpus and a Facebook comments corpus. The CoRAE model achieves an efficiency of 83.28% on the Facebook dataset and 97.57% on the Sanders dataset.
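To make the input to PMI-SA concrete, the sketch below builds sparse PMI word vectors from a toy corpus using the standard definition PMI(w, c) = log(p(w, c) / (p(w) p(c))). It is only an illustration under stated assumptions: the abstract does not specify how co-occurrences are counted, so the function name pmi_vectors, the symmetric window of one word, and the three-sentence corpus are all hypothetical choices, not the authors' setup.

```python
import math
from collections import Counter, defaultdict


def pmi_vectors(sentences, window=1):
    """Return {word: {context_word: PMI}} for tokenized sentences.

    Assumption: co-occurrence means "within `window` positions in the
    same sentence"; the paper may count co-occurrences differently.
    """
    # Count (word, context) pairs inside a symmetric window.
    pair_counts = Counter()
    for tokens in sentences:
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            stop = min(len(tokens), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    pair_counts[(word, tokens[j])] += 1

    # Marginal counts for words and for contexts.
    total = sum(pair_counts.values())
    word_totals = Counter()
    context_totals = Counter()
    for (word, context), n in pair_counts.items():
        word_totals[word] += n
        context_totals[context] += n

    # PMI = log( p(w, c) / (p(w) * p(c)) ), written with raw counts.
    vectors = defaultdict(dict)
    for (word, context), n in pair_counts.items():
        ratio = n * total / (word_totals[word] * context_totals[context])
        vectors[word][context] = math.log(ratio)
    return dict(vectors)


# Hypothetical toy corpus, already tokenized.
corpus = [["good", "phone"], ["bad", "phone"], ["good", "service"]]
vectors = pmi_vectors(corpus)
print(vectors["good"])
```

Each word's vector has one entry per observed context word, so its dimensionality grows with the vocabulary. That growth is the kind of high-dimensional input that the abstract's dimensionality-reduction motivation for an auto-encoder refers to.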

Author information

Corresponding author

Correspondence to Hanen Ameur.

About this article

Cite this article

Ameur, H., Jamoussi, S. & Hamadou, A.B. A New Method for Sentiment Analysis Using Contextual Auto-Encoders. J. Comput. Sci. Technol. 33, 1307–1319 (2018). https://doi.org/10.1007/s11390-018-1889-1

  • DOI: https://doi.org/10.1007/s11390-018-1889-1
