DOI: 10.1145/3106426.3106455
research-article

Constructing and visualizing topic forests for text streams

Published: 23 August 2017

Abstract

Large volumes of text such as news and blog articles, web pages, and scientific literature are posted on the web over time; such collections are generally called time-series documents or text streams. For each document, some strongly or weakly relevant texts exist. Such relevance is explicit in some domains, for example as citations among scientific papers, trackbacks among blog articles, and hyperlinks among Wikipedia articles or web pages, but the relevance among news articles is not always clearly specified. One easy way to build a similarity network is to calculate the similarity among news articles and link similar articles; however, adding information about the posted times of articles to a similarity network is difficult. To overcome this problem, we propose a framework that consists of two parts: 1) tree structures called Topic Forests and 2) their visualization. Topic Forests are constructed by semantically and temporally linking cohesive texts while preserving their posted order. We give users effective access to text streams by embedding Topic Forests in polar coordinates with a technique called Polar Coordinate Embedding. In experimental evaluations using actual text streams of news articles, we confirm that Topic Forests maintain semantic and temporal cohesiveness and that Polar Coordinate Embedding achieves effective accessibility.
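
The idea of linking similar texts while preserving their posted order can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes a bag-of-words cosine similarity, a hypothetical `threshold` parameter, and a simple rule in which each article is attached to its most similar earlier article. The names `cosine` and `build_forest` are illustrative only.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_forest(docs, threshold=0.3):
    """Attach each document to its most similar earlier document.

    docs is a list of (posted_time, text) pairs. The result maps each
    document index to the index of its parent, or to None for a root.
    The linking rule and the threshold are assumptions for illustration.
    """
    # Process documents in posted order so that parents always precede children.
    order = sorted(range(len(docs)), key=lambda i: docs[i][0])
    vectors = [Counter(text.lower().split()) for _, text in docs]
    parent = {}
    for pos, i in enumerate(order):
        best, best_sim = None, threshold
        # Only earlier documents are candidates; ties go to the more recent one.
        for j in order[:pos]:
            sim = cosine(vectors[i], vectors[j])
            if sim >= best_sim:
                best, best_sim = j, sim
        parent[i] = best
    return parent


docs = [
    (1, "earthquake hits coastal city"),
    (2, "election results announced today"),
    (3, "coastal city earthquake damage grows"),
    (4, "election results contested in court"),
]
print(build_forest(docs))
```

Because every document has at most one parent and that parent was posted earlier, the links cannot form a cycle, so the result is a set of trees rooted at the first article of each topic. In the sample data, the two earthquake articles form one tree and the two election articles form another.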

Cited By

  • (2018) Dynamic Visualization of Citation Networks and Detection of Influential Node Addition. Complex Networks IX, 291-302. DOI: 10.1007/978-3-319-73198-8_25. Online publication date: 15-Feb-2018

Published In

WI '17: Proceedings of the International Conference on Web Intelligence
August 2017
1284 pages
ISBN:9781450349512
DOI:10.1145/3106426
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. text stream
  2. tree structure
  3. visualization

Qualifiers

  • Research-article

Funding Sources

  • JSPS

Conference

WI '17
Acceptance Rates

WI '17 Paper Acceptance Rate 118 of 178 submissions, 66%;
Overall Acceptance Rate 118 of 178 submissions, 66%
