DOI: 10.1145/1835804.1835889
research-article

Online multiscale dynamic topic models

Published: 25 July 2010

Abstract

We propose an online topic model for sequentially analyzing the time evolution of topics in document collections. Topics naturally evolve on multiple timescales. For example, some words may be used consistently over one hundred years, while other words emerge and disappear over periods of a few days. Thus, in the proposed model, current topic-specific distributions over words are assumed to be generated based on the multiscale word distributions of the previous epoch. Considering both long-timescale and short-timescale dependencies yields a more robust model. We derive efficient online inference procedures based on a stochastic EM algorithm, in which the model is sequentially updated using newly obtained data; this means that past data are not required to make the inference. We demonstrate the effectiveness of the proposed method in terms of predictive performance and computational efficiency by examining collections of real documents with timestamps.
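
As a rough illustration of the multiscale idea described in the abstract, the sketch below builds one word distribution per timescale from a topic's past per-epoch word counts and mixes them into a single prior mean for the current epoch. The dyadic window sizes (the last 2^(s-1) epochs for scale s), the uniform smoothing component, the fixed mixing weights, and the function names `multiscale_distributions` and `prior_mean` are all assumptions made for this example. The abstract does not specify them, and the stochastic EM inference procedure is not shown.

```python
from typing import List, Sequence


def multiscale_distributions(
    history: Sequence[Sequence[float]], num_scales: int
) -> List[List[float]]:
    """Return one word distribution per timescale (illustrative only).

    history[-1] holds the word counts of the most recent epoch for one topic.
    Scale s (1-based) pools the counts of the last 2**(s-1) epochs, so small s
    tracks short-lived words and large s tracks long-lived ones.
    """
    vocab_size = len(history[0])
    scales = []
    for s in range(1, num_scales + 1):
        # Slicing past the start of the list simply returns all epochs.
        window = history[-(2 ** (s - 1)):]
        totals = [sum(epoch[w] for epoch in window) for w in range(vocab_size)]
        norm = sum(totals)
        if norm == 0:
            # No observations in this window: fall back to uniform.
            scales.append([1.0 / vocab_size] * vocab_size)
        else:
            scales.append([c / norm for c in totals])
    return scales


def prior_mean(
    scales: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Mix a uniform distribution (weights[0]) with the per-scale distributions.

    weights[1:] pair up with scales in order; the result is renormalized.
    """
    vocab_size = len(scales[0])
    total = sum(weights)
    mean = [weights[0] / vocab_size] * vocab_size
    for lam, dist in zip(weights[1:], scales):
        for w in range(vocab_size):
            mean[w] += lam * dist[w]
    return [m / total for m in mean]


if __name__ == "__main__":
    # Hypothetical counts for a 3-word vocabulary over 4 past epochs.
    past_counts = [[4, 0, 0], [0, 4, 0], [0, 0, 4], [0, 0, 4]]
    dists = multiscale_distributions(past_counts, num_scales=3)
    print(prior_mean(dists, weights=[0.1, 0.3, 0.3, 0.3]))
```

In this toy setup only per-epoch count summaries are kept, not the documents themselves. How the proposed model stores and updates its multiscale statistics is described in the paper, not here.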

Supplementary Material

JPG File (kdd2010_iwata_omdtm_01.jpg)
MOV File (kdd2010_iwata_omdtm_01.mov)

Published In

KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
July 2010
1240 pages
ISBN: 9781450300551
DOI: 10.1145/1835804

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. online learning
  2. time-series analysis
  3. topic model

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Cited By

  • (2024) Sparseness-constrained nonnegative tensor factorization for detecting topics at different time scales. Frontiers in Applied Mathematics and Statistics 10. DOI: 10.3389/fams.2024.1287074. Online publication date: 22-Jul-2024
  • (2024) Learning Joint Topic Representation for Detecting Drift in Social Media Text. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 32:06 (955-983). DOI: 10.1142/S0218488524500247. Online publication date: 21-Oct-2024
  • (2024) Bi-channel dynamic topic model for quality monitoring considering initial and additional online customer reviews. IISE Transactions (1-25). DOI: 10.1080/24725854.2024.2414394. Online publication date: 9-Oct-2024
  • (2024) ANTM: Aligned Neural Topic Models for Exploring Evolving Topics. Transactions on Large-Scale Data- and Knowledge-Centered Systems LVI (76-97). DOI: 10.1007/978-3-662-69603-3_3. Online publication date: 21-Jul-2024
  • (2023) Detecting Favorite Topics in Computing Scientific Literature via Dynamic Topic Modeling. IEEE Access 11 (41535-41545). DOI: 10.1109/ACCESS.2023.3269660. Online publication date: 2023
  • (2022) Identification of topic evolution: network analytics with piecewise linear representation and word embedding. Scientometrics 127:9 (5353-5383). DOI: 10.1007/s11192-022-04273-1. Online publication date: 1-Sep-2022
  • (2022) Dynamic Topic-Noise Models for Social Media. Advances in Knowledge Discovery and Data Mining (429-443). DOI: 10.1007/978-3-031-05936-0_34. Online publication date: 16-May-2022
  • (2021) Predicting standardized absolute returns using rolling-sample textual modelling. PLOS ONE 16:12 (e0260132). DOI: 10.1371/journal.pone.0260132. Online publication date: 7-Dec-2021
  • (2021) Recurrent Coupled Topic Modeling over Sequential Documents. ACM Transactions on Knowledge Discovery from Data 16:1 (1-32). DOI: 10.1145/3451530. Online publication date: 20-Jul-2021
  • (2020) A Micro Perspective of Research Dynamics Through “Citations of Citations” Topic Analysis. Journal of Data and Information Science 5:4 (19-34). DOI: 10.2478/jdis-2020-0034. Online publication date: 28-Jul-2020
