research-article

Compass: Spatio Temporal Sentiment Analysis of US Election What Twitter Says!

Authors:

Murali Krishna Teja,

Richie Frost

KDD '17: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Pages 1585 - 1594

https://doi.org/10.1145/3097983.3098053

Published: 13 August 2017

Abstract

With the widespread growth of social network tools and platforms, analyzing and understanding societal response and crowd reaction to important and emerging social issues and events through social media data is an increasingly important problem. However, there are numerous challenges to realizing this goal effectively and efficiently, due to the unstructured and noisy nature of social media data. The large volume of the underlying data also presents a fundamental challenge. Furthermore, in many application scenarios, it is often interesting, and in some cases critical, to discover patterns and trends based on geographical and/or temporal partitions, and to keep track of how they change over time.

This brings up the interesting problem of spatio-temporal sentiment analysis from large-scale social media data. This paper investigates this problem through a data science project called "US Election 2016, What Twitter Says". The objective is to discover sentiment on Twitter towards either the Democratic or the Republican party at US county and state levels over arbitrary temporal intervals, using a large collection of geotagged tweets from a period of 6 months leading up to the US Presidential Election in 2016. Our results demonstrate that by integrating and developing a combination of machine learning and data management techniques, it is possible to do this at scale with effective outcomes. The results of our project have the potential to be adapted towards solving and influencing other interesting social issues such as building neighborhood happiness and health indicators.
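To make the query in the objective concrete, the following is a minimal sketch of aggregating sentiment by region over an arbitrary time interval. It is not the paper's method: the class name `RegionSentimentIndex`, the `(region, timestamp, score)` record layout, and the signed score encoding are all assumptions made for illustration. Per-region prefix sums over time-sorted scores let each interval query run with two binary searches.

```python
from bisect import bisect_left
from collections import defaultdict
from datetime import datetime


class RegionSentimentIndex:
    """Hypothetical helper: mean sentiment per region over a time interval."""

    def __init__(self, records):
        # records: iterable of (region, timestamp, score). The signed score
        # (positive leans to one party, negative to the other) is an assumed
        # encoding, not one taken from the paper.
        by_region = defaultdict(list)
        for region, ts, score in records:
            by_region[region].append((ts, score))

        self._times = {}
        self._prefix = {}
        for region, rows in by_region.items():
            rows.sort(key=lambda row: row[0])  # order tweets by timestamp
            self._times[region] = [ts for ts, _ in rows]
            sums = [0.0]  # sums[i] = total score of the first i tweets
            for _, score in rows:
                sums.append(sums[-1] + score)
            self._prefix[region] = sums

    def mean_sentiment(self, region, start, end):
        """Mean score of tweets in `region` with start <= timestamp < end."""
        times = self._times.get(region, [])
        lo = bisect_left(times, start)
        hi = bisect_left(times, end)
        if hi == lo:
            return None  # no tweets for this region in the interval
        prefix = self._prefix[region]
        return (prefix[hi] - prefix[lo]) / (hi - lo)
```

Here a region could be a county or a state identifier; state-level results could be obtained by indexing the same tweets under their state key.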

Supplementary Material

MP4 File (paul_sentiment_analysis.mp4), 349.86 MB



Index Terms

Information systems



Published In


KDD '17: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

August 2017

2240 pages

ISBN:9781450348874

DOI:10.1145/3097983

General Chairs: Stan Matwin (Dalhousie University), Shipeng Yu (LinkedIn), Faisal Farooq (IBM)

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


Publisher

Association for Computing Machinery

New York, NY, United States


Funding Sources

NSFC Grant
NSF Grant

Conference

KDD '17: The 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

August 13 - 17, 2017

Halifax, NS, Canada

Acceptance Rates

KDD '17 Paper Acceptance Rate 64 of 748 submissions, 9%;

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%



