Abstract
Time-stamped texts, or text sequences, such as news reports, are ubiquitous in real life. Tracking how the topics of these texts evolve has attracted considerable interest, and recent work has developed methods for tracking topic shifts over long time scales. However, most of these studies focus on large corpora. They also consider only the text itself: no attempt has been made to explore the temporal distribution of the corpus, which could provide meaningful and comprehensive clues for topic tracking. In this paper, we formally address this problem and put forward a novel method based on the topic model. We investigate the temporal distribution of the news reports on a specific event and integrate this information into a topic model to improve its performance. By focusing on a specific news event, we aim to reveal more details about the event, such as how many stages it has and which aspect each stage focuses on.
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, J., Liu, X., Wang, J., Zhao, W. (2014). News Topic Evolution Tracking by Incorporating Temporal Information. In: Zong, C., Nie, JY., Zhao, D., Feng, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2014. Communications in Computer and Information Science, vol 496. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45924-9_43
DOI: https://doi.org/10.1007/978-3-662-45924-9_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45923-2
Online ISBN: 978-3-662-45924-9
eBook Packages: Computer Science, Computer Science (R0)