DOI: 10.1145/2063576.2063994
poster

Imbalanced sentiment classification

Published: 24 October 2011

Abstract

Sentiment classification has undergone significant development in recent years. However, most existing studies assume that negative and positive samples are balanced, which may not hold in practice. In this paper, we instead investigate imbalanced sentiment classification. In particular, we propose a novel clustering-based stratified under-sampling framework and a centroid-directed smoothing strategy to address the imbalanced class distribution and the imbalanced feature distribution, respectively. Evaluation across different datasets shows that both the under-sampling framework and the smoothing strategy are effective in handling these imbalance problems in real sentiment classification applications.
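
The abstract does not spell out how the clustering-based stratified under-sampling works, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that the majority class is first clustered (plain k-means is used here as a stand-in) and that samples are then drawn from every cluster in proportion to its size, so the reduced set still covers the different regions of the majority class. The function names `kmeans` and `stratified_undersample`, the parameter `n_keep`, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on dense vectors; returns one cluster index per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            assign[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: mean of the members; an empty cluster keeps its centroid.
        for c in range(k):
            members = [points[i] for i in range(len(points)) if assign[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def stratified_undersample(majority, n_keep, k=3, seed=0):
    """Return indices of n_keep majority samples, drawn per cluster (assumed scheme)."""
    if not 0 <= n_keep <= len(majority):
        raise ValueError("n_keep must be between 0 and len(majority)")
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for idx, c in enumerate(kmeans(majority, k, seed=seed)):
        clusters[c].append(idx)
    # Proportional allocation in integer arithmetic, with largest-remainder
    # rounding so that the per-cluster quotas sum exactly to n_keep.
    total = len(majority)
    alloc = {c: n_keep * len(m) // total for c, m in clusters.items()}
    remainder = {c: n_keep * len(m) % total for c, m in clusters.items()}
    leftover = n_keep - sum(alloc.values())
    for c in sorted(remainder, key=remainder.get, reverse=True)[:leftover]:
        alloc[c] += 1
    chosen = []
    for c, members in clusters.items():
        chosen.extend(rng.sample(members, alloc[c]))
    return sorted(chosen)


# Toy usage: 12 majority-class vectors reduced to 4, e.g. to match a minority class of size 4.
majority = [
    [0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.4], [0.2, 0.2],
    [10.0, 10.1], [10.2, 9.9], [9.8, 10.0],
    [20.0, 0.1], [20.2, 0.0], [19.9, 0.3],
]
picked = stratified_undersample(majority, n_keep=4, k=3, seed=1)
print(len(picked))
```

Compared with uniform random under-sampling, drawing per cluster keeps small but distinct groups of majority samples represented in the reduced training set. The centroid-directed smoothing strategy is not sketched here because the abstract gives no detail on it.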

Information & Contributors

Information

Published In

CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management
October 2011
2712 pages
ISBN:9781450307178
DOI:10.1145/2063576
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. imbalanced classification
  2. opinion mining
  3. sentiment classification

Qualifiers

  • Poster

Conference

CIKM '11
Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 6
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 03 Mar 2025

Citations

Cited By

  • (2023) Cluster-based ensemble learning model for improving sentiment classification of Arabic documents. Natural Language Engineering, pp. 1-39. DOI: 10.1017/S135132492300027X. Online publication date: 1-Jun-2023
  • (2022) A Novel Classification Method Based on a Two-Phase Technique for Learning Imbalanced Text Data. Symmetry 14:3, article 567. DOI: 10.3390/sym14030567. Online publication date: 13-Mar-2022
  • (2022) Sentiment analysis of government regulations regarding the implementation of reverse-transcriptase polymerase chain reaction (RT-PCR) during the Covid-19 pandemic in Indonesia (Case study: Air transportation mode). The 2nd International Conference of Science and Information Technology in Smart Administration (ICSINTESA 2021), article 090002. DOI: 10.1063/5.0107573. Online publication date: 2022
  • (2021) Sentiment classification based on weak tagging information and imbalanced data. Intelligent Data Analysis 25:3, pp. 555-570. DOI: 10.3233/IDA-205408. Online publication date: 20-Apr-2021
  • (2021) Improving of Imbalanced Data in Multiclass Classification for Sentiment Analysis using Supervised Term Weighting. 2021 Research, Invention, and Innovation Congress: Innovation Electricals and Electronics (RI2C), pp. 19-24. DOI: 10.1109/RI2C51727.2021.9559797. Online publication date: 1-Sep-2021
  • (2021) Deep Learning Sentiment Classification Based on Weak Tagging Information. IEEE Access 9, pp. 66509-66518. DOI: 10.1109/ACCESS.2021.3077059. Online publication date: 2021
  • (2021) Context-sensitive lexicon for imbalanced text sentiment classification using bidirectional LSTM. Journal of Intelligent Manufacturing 34:5, pp. 2123-2132. DOI: 10.1007/s10845-021-01866-0. Online publication date: 10-Nov-2021
  • (2021) A Semi-supervised Framework for Misinformation Detection. Discovery Science, pp. 57-66. DOI: 10.1007/978-3-030-88942-5_5. Online publication date: 11-Oct-2021
  • (2020) Imbalanced sentiment classification based on sequence generative adversarial nets. Journal of Intelligent & Fuzzy Systems, pp. 1-11. DOI: 10.3233/JIFS-201370. Online publication date: 27-Aug-2020
  • (2019) Tackling the Problem of Class Imbalance in Multi-class Sentiment Classification: An Experimental Study. Foundations of Computing and Decision Sciences 44:2, pp. 151-178. DOI: 10.2478/fcds-2019-0009. Online publication date: 6-Jun-2019
