Research article
DOI: 10.1145/1835804.1835827

TIARA: a visual exploratory text analytic system

Published: 25 July 2010

Abstract

In this paper, we present a novel exploratory visual analytic system called TIARA (Text Insight via Automated Responsive Analytics), which combines text analytics and interactive visualization to help users explore and analyze large collections of text. Given a collection of documents, TIARA first uses topic analysis techniques to summarize the documents into a set of topics, each of which is represented by a set of keywords. In addition to extracting topics, TIARA derives time-sensitive keywords to depict the content evolution of each topic over time. To help users understand the topic-based summarization results, TIARA employs several interactive text visualization techniques to explain the summarization results and seamlessly link such results to the original text. We have applied TIARA to several real-world applications, including email summarization and patient record analysis. To measure the effectiveness of TIARA, we have conducted several experiments. Our experimental results and initial user feedback suggest that TIARA is effective in aiding users in their exploratory text analytic tasks.
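The abstract does not specify how the time-sensitive keywords are derived. As a rough illustration only, the sketch below assumes a TF-IDF-style heuristic: within the documents of one topic, a word ranks highly for a time slice when it is frequent in that slice and appears in few other slices. The function name `time_sensitive_keywords`, the `(time_slice, tokens)` input format, and the scoring formula are all assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def time_sensitive_keywords(docs, top_k=2):
    """Rank keywords per time slice for the documents of one topic.

    docs: list of (time_slice, tokens) pairs (hypothetical input format).
    A word scores high in a slice when it is frequent there and appears
    in few other slices (a TF-IDF-style heuristic, not TIARA's method).
    """
    # Count word occurrences within each time slice.
    slice_counts = {}
    for time_slice, tokens in docs:
        slice_counts.setdefault(time_slice, Counter()).update(tokens)

    # For each word, count the number of slices it appears in.
    slices_with_word = Counter()
    for counts in slice_counts.values():
        slices_with_word.update(counts.keys())

    n_slices = len(slice_counts)
    ranked = {}
    for time_slice, counts in slice_counts.items():
        total = sum(counts.values())
        scores = {
            word: (count / total)
            * (math.log((1 + n_slices) / (1 + slices_with_word[word])) + 1.0)
            for word, count in counts.items()
        }
        # Sort by descending score; break ties alphabetically for determinism.
        ordered = sorted(scores, key=lambda w: (-scores[w], w))
        ranked[time_slice] = ordered[:top_k]
    return ranked


# Toy documents assumed to belong to a single topic (made-up data).
example_docs = [
    ("2010-01", ["budget", "meeting", "budget", "review"]),
    ("2010-02", ["meeting", "launch", "launch", "demo"]),
    ("2010-03", ["meeting", "hiring", "hiring", "interview"]),
]
keywords = time_sensitive_keywords(example_docs)
```

In this toy data, "meeting" occurs in every slice, so it is down-weighted and the words specific to each slice rank first. A real system would likely start from the per-topic word distributions produced by the topic model rather than raw token counts.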

Supplementary Material

JPG File (kdd2010_wei_tiara_01.jpg)
MOV File (kdd2010_wei_tiara_01.mov)


Published In

KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
July 2010
1240 pages
ISBN: 9781450300551
DOI: 10.1145/1835804
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. document summarization
  2. exploratory analytics
  3. interactive visualization
  4. topic analysis
  5. visual text analytics

Qualifiers

  • Research-article

Conference

KDD '10

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%


Cited By

  • (2025) "My Very Subjective Human Interpretation": Domain Expert Perspectives on Navigating the Text Analysis Loop for Topic Models. Proceedings of the ACM on Human-Computer Interaction, 9:1, pp. 1-30. DOI: 10.1145/3701201. Online publication date: 10-Jan-2025.
  • (2024) TextVista: NLP-Enriched Time-Series Text Data Visualizations. Proceedings of the 50th Graphics Interface Conference, pp. 1-12. DOI: 10.1145/3670947.3670971. Online publication date: 3-Jun-2024.
  • (2024) A New Perspective for the Study of "Five Thousand Years of Chinese History": Visual Analysis of Single Text and Time Series Text. 2024 IEEE International Conference on Big Data (BigData), pp. 8793-8797. DOI: 10.1109/BigData62323.2024.10825897. Online publication date: 15-Dec-2024.
  • (2024) Amplifying the music listening experience through song comments on music streaming platforms. Journal of Visualization, 27:3, pp. 401-419. DOI: 10.1007/s12650-024-00966-2. Online publication date: 10-Mar-2024.
  • (2023) BiverWordle: Visualizing Stock Market Sentiment with Financial Text Data and Trends. Proceedings of the 16th International Symposium on Visual Information Communication and Interaction, pp. 1-5. DOI: 10.1145/3615522.3615541. Online publication date: 22-Sep-2023.
  • (2023) CoArgue: Fostering Lurkers’ Contribution to Collective Arguments in Community-based QA Platforms. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, pp. 1-17. DOI: 10.1145/3544548.3580932. Online publication date: 19-Apr-2023.
  • (2023) TopicBubbler: An interactive visual analytics system for cross-level fine-grained exploration of social media data. Visual Informatics, 7:4, pp. 41-56. DOI: 10.1016/j.visinf.2023.08.002. Online publication date: Dec-2023.
  • (2022) PlanHelper. Proceedings of the ACM on Human-Computer Interaction, 6:CSCW2, pp. 1-26. DOI: 10.1145/3555555. Online publication date: 11-Nov-2022.
  • (2022) VisInReport: Complementing Visual Discourse Analytics Through Personalized Insight Reports. IEEE Transactions on Visualization and Computer Graphics, 28:12, pp. 4757-4769. DOI: 10.1109/TVCG.2021.3104026. Online publication date: 1-Dec-2022.
  • (2022) DeFVIS: Visual Analysis of Microscopic Parameters and Macroscopic Properties of Fluoropolymers. 2022 5th International Conference on Pattern Recognition and Artificial Intelligence (PRAI), pp. 1292-1298. DOI: 10.1109/PRAI55851.2022.9904043. Online publication date: 19-Aug-2022.
