DOI: 10.1145/3126858.3126861
research-article

A Majority Voting Approach for Sentiment Analysis in Short Texts using Topic Models

Published: 17 October 2017

Abstract

People can now leave feedback on products and services on the web, and site owners can use this information to better understand their audience's preferences. Sentiment analysis helps with this task by providing methods to infer the polarity of reviews. In these methods, the classifier can use hints about the polarity of the words and the subject being discussed to infer the polarity of the text. However, many of these texts are short, which makes it difficult for the classifier to infer these hints. We propose a new sentiment analysis method that uses topic models to infer the polarity of short texts. The intuition is that topics help the classifier better understand the context and thereby improve its performance on this task. We first represent each review using a topic inference method such as LDA, BTM, or MedLDA, and then apply a classifier (e.g., Linear SVM, Random Forest, or Logistic Regression). We combine the results of classifiers and text representations in two ways: (1) using a single topic representation and multiple classifiers, and (2) using multiple topic representations and a single classifier. We also analyze the impact of expanding these texts, since topic models can have difficulty dealing with the data sparsity present in these reviews. The proposed approach achieved gains of up to 8.5% over our baseline. Moreover, we were able to determine the best classifier (Random Forest) and the best topic detection method (MedLDA).
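The combination step named in the title can be illustrated with a minimal sketch of hard majority voting over per-model predictions. This is not the authors' implementation: the function name `majority_vote`, the label strings, and the tie-breaking rule are assumptions made here for illustration. The same function covers both combination modes, since each inner list can come either from a different classifier on one topic representation or from one classifier on different topic representations.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def majority_vote(predictions: Sequence[Sequence[Hashable]]) -> List[Hashable]:
    """Combine per-model label predictions by hard majority voting.

    predictions[m][i] is the label that model m assigned to text i.
    Ties go to the tied label that appears first in model order; this
    rule is an assumption of the sketch, the abstract does not specify one.
    """
    if not predictions:
        return []
    n_texts = len(predictions[0])
    if any(len(p) != n_texts for p in predictions):
        raise ValueError("every model must label the same number of texts")

    combined = []
    for votes in zip(*predictions):  # votes = all models' labels for one text
        # most_common(1) returns the label with the highest vote count
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical predictions from three classifiers over three short reviews.
svm_pred = ["pos", "neg", "neg"]
rf_pred = ["pos", "pos", "neg"]
lr_pred = ["neg", "neg", "pos"]

print(majority_vote([svm_pred, rf_pred, lr_pred]))  # ['pos', 'neg', 'neg']
```

Using an odd number of voters with binary labels avoids ties entirely, which is one reason three-model ensembles are a common choice for this kind of scheme.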


Published In

WebMedia '17: Proceedings of the 23rd Brazilian Symposium on Multimedia and the Web
October 2017
522 pages
ISBN: 9781450350969
DOI: 10.1145/3126858

Sponsors

  • SBC: Brazilian Computer Society
  • CNPq: Conselho Nacional de Desenvolvimento Cientifico e Tecn
  • CGIBR: Comite Gestor da Internet no Brazil
  • CAPES: Brazilian Higher Education Funding Council

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. sentiment analysis
  2. text expansion
  3. topic models

Qualifiers

  • Research-article

Conference

Webmedia '17
Webmedia '17: Brazilian Symposium on Multimedia and the Web
October 17 - 20, 2017
Gramado, RS, Brazil

Acceptance Rates

WebMedia '17 Paper Acceptance Rate: 38 of 138 submissions, 28%
Overall Acceptance Rate: 270 of 873 submissions, 31%
