ABSTRACT
Automated evaluation of topic quality remains an important unsolved problem in topic modeling and represents a major obstacle for development and evaluation of new topic models. Previous attempts at the problem have been formulated as variations on the coherence and/or mutual information of top words in a topic. In this work, we propose several new metrics for evaluating topic quality with the help of distributed word representations; our experiments suggest that the new metrics are a better match for human judgement, which is the gold standard in this case, than previously developed approaches.
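The paper does not reproduce its metric definitions in this abstract, but a common way to score topic quality with distributed word representations is to average the pairwise cosine similarities of a topic's top words in embedding space. The sketch below is a minimal illustration of that idea, using hypothetical toy vectors rather than real pretrained embeddings; `embedding_coherence` and the sample words are assumptions for demonstration, not the authors' exact metric.

```python
import numpy as np

def embedding_coherence(topic_words, embeddings):
    """Score a topic as the mean pairwise cosine similarity
    of its top words' embedding vectors (higher = more coherent)."""
    vecs = [embeddings[w] for w in topic_words if w in embeddings]
    sims = []
    for i in range(len(vecs)):
        for j in range(i + 1, len(vecs)):
            a, b = vecs[i], vecs[j]
            sims.append(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return float(np.mean(sims)) if sims else 0.0

# Toy 2-D embeddings: a semantically coherent topic should score
# higher than one containing an off-topic intruder word.
emb = {
    "dog": np.array([1.0, 0.1]),
    "cat": np.array([0.9, 0.2]),
    "pet": np.array([0.95, 0.15]),
    "stock": np.array([0.0, 1.0]),
}
print(embedding_coherence(["dog", "cat", "pet"], emb))    # high similarity
print(embedding_coherence(["dog", "cat", "stock"], emb))  # lower similarity
```

In practice one would substitute pretrained vectors (e.g. GloVe or Polyglot embeddings, as cited by the paper) for the toy dictionary above.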