A Dynamic Probabilistic Model to Visualise Topic Evolution in Text Streams

Kabán, Ata; Girolami, Mark A.

doi:10.1023/A:1013673310093


Published: March 2002

Volume 18, pages 107–125, (2002)

Journal of Intelligent Information Systems



Abstract

We propose a novel probabilistic method, based on latent variable models, for unsupervised topographic visualisation of dynamically evolving, coherent textual information. The method can be seen as a complementary tool for topic detection and tracking applications. It exploits the a priori domain knowledge that the data stream contains relatively homogeneous temporal segments. Unlike topographic techniques previously used for static text collections, the proposed model derives its topography from the temporal coherence of the data stream. Simulation results on toy data and an application to the analysis of Internet chat line discussions are presented by way of demonstration.



Author information

Authors and Affiliations

School of Information and Communications Technology, University of Paisley, Paisley, PA1 2BE, Scotland
Ata Kabán & Mark A. Girolami



About this article

Cite this article

Kabán, A., Girolami, M.A. A Dynamic Probabilistic Model to Visualise Topic Evolution in Text Streams. Journal of Intelligent Information Systems 18, 107–125 (2002). https://doi.org/10.1023/A:1013673310093


Issue Date: March 2002
DOI: https://doi.org/10.1023/A:1013673310093
