Leveraging semantics for sentiment polarity detection in social media

  • Original Article
  • International Journal of Machine Learning and Cybernetics

Abstract

With the increased use of microblogs and social media platforms as forms of online communication, a huge amount of opinionated data is now available, reflecting people’s opinions and attitudes in the form of reviews, forum discussions, blogs and tweets. This has recently brought great interest to the field of sentiment analysis and opinion mining, which analyzes people’s feelings and attitudes from written language. Most existing approaches to sentiment analysis rely mainly on the presence of affect words that explicitly reflect sentiment. However, these approaches are semantically weak: they do not take into account the semantics of words when detecting their sentiment in text. Only recently have a few approaches (e.g. sentic computing) started investigating this direction. Following this trend, this paper investigates the role of semantics in sentiment analysis of social media. To this end, frame semantics and lexical resources such as BabelNet are employed to extract semantic features from social media that lead to more accurate sentiment analysis models. Experiments are conducted with different types of semantic information by assessing their impact on four social media datasets, which incorporate tweets, blogs and movie reviews. A tenfold cross-validation shows that the F1 measure increases significantly when semantics is used in sentiment analysis of social media. The results show that the proposed approach, which considers word semantics for sentiment analysis, surpasses non-semantic approaches on the considered datasets.
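The evaluation setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lookup table `TOY_FRAMES`, its `SEM:` labels, and the helper names `augment`, `f1_score` and `k_fold_indices` are all hypothetical, and a real system would obtain semantic labels from a resource such as BabelNet or a frame-semantic parser. The sketch only shows the general idea of appending semantic features to the word features of a post, splitting the data into ten folds, and scoring predictions with the F1 measure.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical word-to-label table, for illustration only. A real system
# would query a semantic resource instead of a hand-written dictionary.
TOY_FRAMES: Dict[str, str] = {
    "love": "SEM:emotion",
    "hate": "SEM:emotion",
    "movie": "SEM:film",
}


def augment(tokens: Sequence[str], frames: Dict[str, str] = TOY_FRAMES) -> List[str]:
    """Return the tokens followed by the semantic label of each known token."""
    return list(tokens) + [frames[t] for t in tokens if t in frames]


def f1_score(gold: Sequence[int], pred: Sequence[int], positive: int = 1) -> float:
    """F1 = 2PR / (P + R) for the given positive class; 0.0 if no true positives."""
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n) into k (train, test) index pairs; fold i tests every k-th item."""
    folds = []
    for i in range(k):
        test = list(range(i, n, k))
        held_out = set(test)
        train = [j for j in range(n) if j not in held_out]
        folds.append((train, test))
    return folds
```

In such a setup, a classifier would be trained once on plain tokens and once on the output of `augment` for each of the ten folds, and the per-fold F1 values would be averaged to compare the two feature sets.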

Acknowledgements

This work has been supported by the Sardinia Regional Government (P.O.R. Sardegna F.S.E. Operational Programme of the Autonomous Region of Sardinia, European Social Fund 2014–2020, Axis IV Human Resources, Objective l.3, Line of Activity l.3.1). The authors gratefully acknowledge the Sardinia Regional Government for its financial support (Convenzione triennale tra la Fondazione di Sardegna e gli Atenei Sardi Regione Sardegna L.R. 7/2007 annualità 2016 DGR 28/21 del 17.05.201, CUP: F72F16003030002). Moreover, the research leading to these results has received funding from the European Union Horizon 2020 Framework Programme for Research and Innovation (2014–2020) under grant agreement 643808, Project MARIO (Managing active and healthy aging with use of caring service robots).

Author information

Corresponding author

Correspondence to Diego Reforgiato Recupero.

About this article

Cite this article

Dridi, A., Reforgiato Recupero, D. Leveraging semantics for sentiment polarity detection in social media. Int. J. Mach. Learn. & Cyber. 10, 2045–2055 (2019). https://doi.org/10.1007/s13042-017-0727-z

  • DOI: https://doi.org/10.1007/s13042-017-0727-z
