Coherence Regularization for Neural Topic Models

Krasnashchok, Katsiaryna; Cherif, Aymen

doi:10.1007/978-3-030-22796-8_45

Conference paper
First Online: 26 June 2019

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11554)

Abstract

Neural topic models aim to predict the words of a document given the document itself. Such models are trained with perplexity as the criterion, whereas their final quality is judged by topic coherence. In this work we introduce a coherence regularization loss that penalizes incoherent topics during model training. We analyze our approach using coherence and an additional metric, exclusivity, which measures the uniqueness of the terms in topics. We argue that this combination of metrics is an adequate indicator of model quality. Our results indicate the effectiveness of our loss and its potential for use in future neural topic models.
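
The abstract does not state the form of the regularization loss or the exact definition of exclusivity. Purely as an illustration of the idea, the sketch below shows one way a coherence-style penalty and a term-uniqueness score could be computed. Everything in it is an assumption rather than the authors' method: the function names `coherence_penalty` and `unique_term_share`, the pairwise similarity matrix (imagined here as rescaled co-occurrence scores from a reference corpus), and the way uniqueness is counted.

```python
from collections import Counter
from itertools import combinations


def coherence_penalty(topic_probs, similarity):
    """Hypothetical penalty: low when a topic's mass sits on mutually similar words.

    topic_probs: list of topics, each a list of word probabilities.
    similarity:  symmetric word-by-word matrix with entries in [0, 1]
                 (assumed to come from an external reference corpus).
    """
    penalties = []
    for probs in topic_probs:
        pair_mass = 0.0   # total probability weight over distinct word pairs
        pair_score = 0.0  # the same weight, scaled by pairwise similarity
        for i, j in combinations(range(len(probs)), 2):
            weight = probs[i] * probs[j]
            pair_mass += weight
            pair_score += weight * similarity[i][j]
        # Weighted mean similarity of two distinct words drawn from the topic.
        # A topic concentrated on a single word has no pairs and scores 0.
        coherence = pair_score / pair_mass if pair_mass > 0 else 0.0
        penalties.append(1.0 - coherence)
    return sum(penalties) / len(penalties)


def unique_term_share(top_words_per_topic):
    """Share of top-word slots filled by a word that occurs in only one topic."""
    slots = [w for topic in top_words_per_topic for w in set(topic)]
    counts = Counter(slots)
    return sum(1 for w in slots if counts[w] == 1) / len(slots)


# Toy vocabulary (hypothetical): goal, match, stock, market.
SIMILARITY = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
coherent_topic = [0.5, 0.5, 0.0, 0.0]  # mass on "goal" and "match"
mixed_topic = [0.5, 0.0, 0.5, 0.0]     # mass on "goal" and "stock"
```

In this toy setup the mixed topic receives a larger penalty than the coherent one, so adding such a term to a training objective would push probability mass toward related words. The uniqueness score falls when topics share top words, which is the behavior the abstract attributes to exclusivity.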

Acknowledgements

The elaboration of this scientific paper was supported by the Ministry of Economy, Industry, Research, Innovation, IT, Employment and Education of the Region of Wallonia (Belgium), through the funding of the industrial research project Jericho (convention no. 7717).

Author information

Authors and Affiliations

EURA NOVA, Rue Emilie Francqui 4, 1435, Mont-Saint-Guibert, Belgium
Katsiaryna Krasnashchok & Aymen Cherif

Corresponding author

Correspondence to Katsiaryna Krasnashchok.

Editor information

Editors and Affiliations

Dalian University of Technology, Dalian, China
Huchuan Lu
Sichuan University, Chengdu, China
Huajin Tang
Northeastern University, Shenyang, China
Zhanshan Wang

About this paper

Cite this paper

Krasnashchok, K., Cherif, A. (2019). Coherence Regularization for Neural Topic Models. In: Lu, H., Tang, H., Wang, Z. (eds) Advances in Neural Networks – ISNN 2019. ISNN 2019. Lecture Notes in Computer Science, vol 11554. Springer, Cham. https://doi.org/10.1007/978-3-030-22796-8_45

DOI: https://doi.org/10.1007/978-3-030-22796-8_45
Published: 26 June 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22795-1
Online ISBN: 978-3-030-22796-8
eBook Packages: Computer Science, Computer Science (R0)