DOI:10.1145/2396761.2398481
short-paper

If you are happy and you know it... tweet

Published: 29 October 2012

Abstract

Extracting sentiment from Twitter data is one of the fundamental problems in social media analytics. Twitter's length constraint makes it difficult, even for a human judge, to determine whether a tweet is positive or negative. In this work we present a general framework for per-tweet (in contrast with batches of tweets) sentiment analysis which consists of: (1) extracting tweets about a desired target subject, (2) separating tweets with sentiment from those without, and (3) setting apart positive from negative tweets. For each step, we study the performance of a number of classical and new machine learning algorithms. We also show that the intrinsic sparsity of tweets allows classification to be performed in a low-dimensional space, via random projections, without losing accuracy. In addition, we present weighted variants of all employed algorithms, which exploit the available labeling uncertainty and further improve classification accuracy. Finally, we show that spatially aggregating our per-tweet classification results produces a very satisfactory outcome, making our approach a good candidate for batch tweet sentiment analysis.
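
Two ideas in the abstract lend themselves to code: classifying after a random projection, and weighting training examples by label confidence. The sketch below is a minimal standard-library illustration, not the paper's implementation. The bag-of-words features, the per-token Gaussian projection seeded by the token string, the toy tweets and their weights, and all function names (`bag_of_words`, `token_column`, `project`, `weighted_centroids`, `classify`) are assumptions made for illustration. A nearest-centroid rule stands in for the Bayes and SVM classifiers listed in the author tags.

```python
import math
import random
from collections import Counter


def bag_of_words(tweet):
    """Sparse representation: token -> count (whitespace tokenizer, assumed)."""
    return Counter(tweet.lower().split())


def token_column(token, dim, seed=0):
    """Column of an implicit Gaussian projection matrix for one token.

    Seeding the generator with the token makes the column reproducible
    without storing a vocabulary-sized matrix.
    """
    rng = random.Random(f"{seed}:{token}")
    scale = 1.0 / math.sqrt(dim)
    return [rng.gauss(0.0, 1.0) * scale for _ in range(dim)]


def project(bow, dim, seed=0):
    """Map a sparse bag of words to a dense vector of length dim."""
    out = [0.0] * dim
    for token, count in bow.items():
        for i, value in enumerate(token_column(token, dim, seed)):
            out[i] += count * value
    return out


def weighted_centroids(vectors, labels, weights):
    """Per-class mean, each example weighted by its label confidence."""
    sums, totals = {}, {}
    for vec, label, weight in zip(vectors, labels, weights):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += weight * value
        totals[label] = totals.get(label, 0.0) + weight
    return {label: [v / totals[label] for v in acc] for label, acc in sums.items()}


def classify(vec, centroids):
    """Return the label of the closest centroid (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(vec, centroids[label]))


# Toy data; the confidence weights are made up for illustration.
train = [
    ("love this phone great", "pos", 1.0),
    ("great day love it", "pos", 0.6),
    ("hate this phone awful", "neg", 1.0),
    ("awful day hate it", "neg", 0.6),
]
DIM = 256  # projected dimension, chosen arbitrarily for the demo
vectors = [project(bag_of_words(text), DIM) for text, _, _ in train]
centroids = weighted_centroids(
    vectors, [label for _, label, _ in train], [w for _, _, w in train]
)
print(classify(project(bag_of_words("love great"), DIM), centroids))
```

Because each tweet touches only a handful of tokens, `project` costs time proportional to the tweet length times the target dimension, independent of vocabulary size. The weights enter only through the class means, so a low-confidence label pulls its centroid less than a high-confidence one.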


Published In

CIKM '12: Proceedings of the 21st ACM international conference on Information and knowledge management
October 2012
2840 pages
ISBN:9781450311564
DOI:10.1145/2396761

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. bayes classification
  2. compressed learning
  3. sparse modeling
  4. supervised learning
  5. svm
  6. twitter sentiment analysis

Qualifiers

  • Short-paper

Conference

CIKM'12

Acceptance Rates

Overall Acceptance Rate 1,029 of 4,238 submissions, 24%
