DOI: 10.1145/2492517.2492639
research-article

TopicFlow: visualizing topic alignment of Twitter data over time

Published: 25 August 2013

Abstract

Social media, particularly Twitter, provides an abundance of real-time data. To account for this volume, researchers often use automated analysis and visualization techniques to produce a high-level overview of a Twitter stream. Existing techniques for understanding Twitter data make use of hashtags or word-pairs and may ignore the complex trends in discussions over time. To remedy this, we present an application of statistical topic modeling and alignment (binned topic models) to group related tweets into automatically generated topics and TopicFlow, an interactive tool to visualize the evolution of these topics. The effectiveness of this visualization for reasoning about large data sets is demonstrated by a usability study with 18 participants.
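
To make the idea of aligning topics across time bins concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not say how topics are compared, so the sketch assumes cosine similarity between topic-word distributions and a similarity cutoff. The function names, the `threshold` value, and the toy topic distributions are all hypothetical.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity between two sparse word-weight dicts."""
    dot = sum(weight * q.get(word, 0.0) for word, weight in p.items())
    norm_p = math.sqrt(sum(weight * weight for weight in p.values()))
    norm_q = math.sqrt(sum(weight * weight for weight in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


def align_bins(prev_topics, next_topics, threshold=0.2):
    """Link topics in adjacent time bins whose similarity meets the cutoff.

    Assumption: cosine similarity and the 0.2 threshold are illustrative
    choices, not values taken from the paper.
    Returns a list of (prev_index, next_index, similarity) tuples.
    """
    links = []
    for i, p in enumerate(prev_topics):
        for j, q in enumerate(next_topics):
            sim = cosine_similarity(p, q)
            if sim >= threshold:
                links.append((i, j, sim))
    return links


# Hypothetical topic-word distributions for two adjacent time bins.
bin_a = [{"storm": 0.6, "flood": 0.4}, {"game": 0.7, "score": 0.3}]
bin_b = [{"storm": 0.5, "rescue": 0.5}, {"election": 1.0}]

links = align_bins(bin_a, bin_b)
for prev_id, next_id, sim in links:
    print(f"bin A topic {prev_id} -> bin B topic {next_id} (similarity {sim:.2f})")
```

In this toy data only the two "storm" topics share vocabulary, so they are the only pair linked. A topic with no link to the previous bin would read as newly emerging, and one with no link to the next bin as ending.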

Published In

ASONAM '13: Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
August 2013
1558 pages
ISBN: 9781450322409
DOI: 10.1145/2492517

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ASONAM '13
ASONAM '13: Advances in Social Networks Analysis and Mining 2013
August 25 - 28, 2013
Niagara, Ontario, Canada

Acceptance Rates

Overall Acceptance Rate 116 of 549 submissions, 21%
