
Coherence Regularization for Neural Topic Models

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11554)

Abstract

Neural topic models aim to predict the words of a document given the document itself. In such models, perplexity is used as the training criterion, whereas the final quality measure is topic coherence. In this work we introduce a coherence regularization loss that penalizes incoherent topics during training. We analyze our approach using coherence together with an additional metric, exclusivity, which captures the uniqueness of the terms in a topic. We argue that this combination of metrics is an adequate indicator of model quality. Our results demonstrate the effectiveness of the proposed loss and its potential for use in future neural topic models.
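The abstract states the key idea, a loss term that penalizes incoherent topics during training, without reproducing the formula in this preview. As a loose illustration only, and not the authors' formulation, the PyTorch sketch below adds an NPMI-based coherence penalty to a topic model's objective and computes a simple exclusivity score; every name in it (`npmi_matrix`, `coherence_penalty`, `exclusivity`, `lambda_coh`) is hypothetical.

```python
# Hypothetical sketch of a coherence regularizer for a neural topic model.
# Assumption: word and word-pair co-occurrence probabilities have been
# estimated from a reference corpus; the paper's actual loss may differ.
import torch

def npmi_matrix(word_probs, joint_probs, eps=1e-12):
    # word_probs: (V,) unigram probabilities; joint_probs: (V, V)
    # co-occurrence probabilities. NPMI(i, j) = PMI(i, j) / -log p(i, j).
    pmi = torch.log(joint_probs + eps) \
        - torch.log(word_probs.unsqueeze(0) * word_probs.unsqueeze(1) + eps)
    return pmi / (-torch.log(joint_probs + eps))

def coherence_penalty(topic_word_logits, npmi):
    # Expected pairwise NPMI under each topic's word distribution,
    # a quadratic form beta_t^T C beta_t. Negating it gives a loss term
    # the optimizer lowers by concentrating each topic's mass on words
    # that actually co-occur in the reference corpus.
    beta = torch.softmax(topic_word_logits, dim=-1)        # (K, V)
    C = npmi - torch.diag(torch.diagonal(npmi))            # ignore self-pairs
    coherence = torch.einsum('kv,vw,kw->k', beta, C, beta) # (K,)
    return -coherence.mean()

def exclusivity(beta, top_k=10):
    # Average share of each top word's total probability mass (summed
    # over all topics) that belongs to the topic listing it.
    col_mass = beta.sum(dim=0)                             # (V,)
    vals, idx = beta.topk(top_k, dim=-1)                   # (K, top_k)
    return (vals / col_mass[idx]).mean()

# Hypothetical training step: add the penalty to the usual objective.
# loss = reconstruction_nll + lambda_coh * coherence_penalty(logits, npmi)
```

In this sketch the penalty stays differentiable because coherence is taken as an expectation over the full topic-word distribution rather than over a hard top-k word list; at evaluation time, coherence and exclusivity would still be computed on each topic's top-k words, as is standard.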


Notes

  1. https://github.com/jhlau/topically-driven-language-model.

  2. https://www.ap.org/en-gb/.

  3. http://www.natcorp.ox.ac.uk/.


Acknowledgements

This work was supported by the Ministry of Economy, Industry, Research, Innovation, IT, Employment and Education of the Region of Wallonia (Belgium), through the funding of the industrial research project Jericho (convention no. 7717).

Author information

Corresponding author: Katsiaryna Krasnashchok



Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Krasnashchok, K., Cherif, A. (2019). Coherence Regularization for Neural Topic Models. In: Lu, H., Tang, H., Wang, Z. (eds) Advances in Neural Networks – ISNN 2019. ISNN 2019. Lecture Notes in Computer Science, vol 11554. Springer, Cham. https://doi.org/10.1007/978-3-030-22796-8_45


  • DOI: https://doi.org/10.1007/978-3-030-22796-8_45

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-22795-1

  • Online ISBN: 978-3-030-22796-8

  • eBook Packages: Computer Science, Computer Science (R0)
