ABSTRACT
This paper presents a new method for Bayesian topical trend analysis. We regard the parameters of the topic Dirichlet priors in latent Dirichlet allocation (LDA) as a function of document timestamps and optimize these parameters with a gradient-based algorithm. Because our method assigns similar hyperparameters to documents with similar timestamps, topic assignment in collapsed Gibbs sampling is influenced by timestamp similarity. We compute TF-IDF-based document similarities from the results of collapsed Gibbs sampling and evaluate our proposal on the link detection task of Topic Detection and Tracking (TDT).
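To illustrate how timestamp-dependent hyperparameters can enter collapsed Gibbs sampling, the following is a minimal sketch rather than the paper's implementation. The abstract does not state how the Dirichlet parameters depend on the timestamp, so the Gaussian-basis form in `alpha_for_timestamp`, along with every function name, parameter, and number below, is a hypothetical choice made only for illustration. `topic_conditional` is the standard collapsed Gibbs conditional for LDA with a per-document asymmetric prior. The gradient-based optimization of the weights is omitted.

```python
import math
import random


def alpha_for_timestamp(t, weights, centers, width):
    """Hypothetical smooth map from a timestamp to K positive Dirichlet parameters.

    alpha_k(t) = exp(sum_j weights[k][j] * basis_j(t)), where basis_j is a
    Gaussian bump centered at centers[j]. This functional form is an assumption;
    the abstract does not give the paper's parameterization.
    """
    basis = [math.exp(-((t - c) ** 2) / (2.0 * width ** 2)) for c in centers]
    return [math.exp(sum(w * b for w, b in zip(row, basis))) for row in weights]


def topic_conditional(n_dk, n_kw, n_k, alpha_d, beta, vocab_size):
    """Collapsed Gibbs conditional p(z = k | everything else) for one token.

    All counts must already exclude the token being resampled.
    n_dk[k]: tokens in this document currently assigned to topic k
    n_kw[k]: occurrences of this word type currently assigned to topic k
    n_k[k]:  total tokens currently assigned to topic k
    alpha_d: this document's Dirichlet parameters (here, timestamp-dependent)
    """
    unnorm = [
        (n_dk[k] + alpha_d[k]) * (n_kw[k] + beta) / (n_k[k] + vocab_size * beta)
        for k in range(len(alpha_d))
    ]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def sample_topic(probs, rng):
    """Draw a topic index from the normalized conditional."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Illustrative setup (all values hypothetical): K = 2 topics, 2 basis functions,
# timestamps rescaled to the interval [0, 1].
weights = [[1.0, -1.0], [-1.0, 1.0]]
centers = [0.0, 1.0]
width = 0.3

a_early = alpha_for_timestamp(0.10, weights, centers, width)
a_near = alpha_for_timestamp(0.12, weights, centers, width)
a_late = alpha_for_timestamp(0.90, weights, centers, width)

# Same counts, different timestamps: only the prior term changes.
probs_early = topic_conditional([3, 1], [2, 2], [10, 10], a_early, 0.1, 50)
probs_late = topic_conditional([3, 1], [2, 2], [10, 10], a_late, 0.1, 50)
z = sample_topic(probs_early, random.Random(0))
```

In this sketch, documents with nearby timestamps (`a_early`, `a_near`) receive nearly identical priors, while a distant timestamp (`a_late`) shifts prior mass toward the other topic, which in turn shifts the sampling distribution for otherwise identical counts.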
Index Terms
- Dynamic hyperparameter optimization for Bayesian topical trend analysis