DOI: 10.1145/1645953.1646023

Interactive, topic-based visual text summarization and analysis

Published: 02 November 2009


We are building an interactive, visual text analysis tool that helps users analyze large collections of text. Unlike existing work in text analysis, which focuses either on developing sophisticated text analytic techniques or on inventing novel visualization metaphors, our work tightly integrates state-of-the-art text analytics with interactive visualization to maximize the value of both. In this paper, we describe two aspects of our work. First, we present the design and development of a time-based, visual text summary that effectively conveys complex text summarization results produced by the Latent Dirichlet Allocation (LDA) model. Second, we describe a set of rich interaction tools that allow users to work with a created visual text summary to further interpret the summarization results in context and to examine the text collection from multiple perspectives. As a result, our work offers two unique contributions. First, we provide an effective visual metaphor that transforms complex and even imperfect text summarization results into a comprehensible visual summary of texts. Second, we offer users a set of flexible visual interaction tools as alternatives that compensate for the deficiencies of current text summarization techniques. We have applied our work to a number of text corpora, and our evaluation shows the promise of the work, especially in support of complex text analyses.
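To make the idea of a time-based summary of topic-model output more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each document already carries a time bin and a vector of topic proportions (as an LDA implementation would produce), sums those proportions per time bin, and stacks the sums into bands as a stacked graph would draw them. The function names, the input layout, and the sample numbers are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

# Hypothetical input: one (time_bin, topic_proportions) pair per document.
# In practice the proportions would come from an LDA model; here they are made up.
Doc = Tuple[int, Sequence[float]]


def topic_strength_by_time(docs: List[Doc], n_topics: int) -> Dict[int, List[float]]:
    """Sum each topic's document-level proportion within every time bin."""
    strength: Dict[int, List[float]] = defaultdict(lambda: [0.0] * n_topics)
    for time_bin, proportions in docs:
        for k in range(n_topics):
            strength[time_bin][k] += proportions[k]
    return dict(strength)


def stack_layers(
    strength: Dict[int, List[float]], n_topics: int
) -> Dict[int, List[Tuple[float, float]]]:
    """Convert per-bin strengths into (bottom, top) bands, stacked upward from zero."""
    bands: Dict[int, List[Tuple[float, float]]] = {}
    for time_bin in sorted(strength):
        bottom = 0.0
        row = []
        for k in range(n_topics):
            top = bottom + strength[time_bin][k]
            row.append((bottom, top))
            bottom = top  # the next topic's band starts where this one ends
        bands[time_bin] = row
    return bands


# Toy data: two documents in time bin 0, one in time bin 1, two topics.
docs = [
    (0, [0.5, 0.5]),
    (0, [0.25, 0.75]),
    (1, [1.0, 0.0]),
]
strength = topic_strength_by_time(docs, n_topics=2)
bands = stack_layers(strength, n_topics=2)
for time_bin, row in bands.items():
    print(time_bin, row)
```

Each band's height at a time bin reflects how much text in that bin is attributed to the topic, which is one simple way to read "topic strength over time". A zero baseline is used here for simplicity; other stacked-graph baselines and layer orderings are possible.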



Published In

CIKM '09: Proceedings of the 18th ACM conference on Information and knowledge management
November 2009
2162 pages



Association for Computing Machinery

New York, NY, United States


Author Tags

  1. interactive text visualization
  2. stacked graph
  3. text summarization
  4. topic model


  • Research-article


CIKM '09

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
