Topic Models for NLP Applications

  • Reference work entry
Encyclopedia of Machine Learning and Data Mining

Abstract

Topic modeling is a machine learning technique for discovering semantic topics in a document collection. It typically models each document as a multinomial distribution over latent topics and each topic as a multinomial distribution over words. By capturing word co-occurrence statistics across the documents, it uncovers these distributions, which indicate important semantic relationships. Topic modeling has been widely studied in machine learning, text mining, and natural language processing (NLP). This chapter introduces topic modeling, covering both its fundamental techniques and some of its important applications in NLP.
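
The two-level assumption described above (a document is a distribution over topics, a topic is a distribution over words) can be illustrated with a minimal generative sketch. Everything concrete here is an assumption made for illustration and is not taken from the chapter: the toy vocabulary, the two hand-written topics, the helper names, and the Dirichlet draw for the document's topic proportions, which follows the convention of LDA-style models.

```python
import random

# Hypothetical toy topics: each topic is a multinomial distribution over words.
TOPICS = {
    "sports": {"game": 0.5, "team": 0.4, "market": 0.1},
    "finance": {"market": 0.6, "stock": 0.3, "team": 0.1},
}


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(n_words, alpha, rng):
    """Generate words by first picking a topic per word, then a word from it."""
    names = list(TOPICS)
    # The document's multinomial distribution over topics (assumed Dirichlet prior).
    theta = sample_dirichlet(alpha, rng)
    words = []
    for _ in range(n_words):
        topic = rng.choices(names, weights=theta)[0]
        dist = TOPICS[topic]
        word = rng.choices(list(dist), weights=list(dist.values()))[0]
        words.append(word)
    return theta, words


rng = random.Random(0)
theta, doc = generate_document(8, alpha=[0.5, 0.5], rng=rng)
print("topic proportions:", theta)
print("document:", doc)
```

Fitting a topic model runs this process in reverse: given only the observed words, inference recovers plausible topic proportions per document and word distributions per topic from co-occurrence patterns.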

Author information

Correspondence to Zhiyuan Chen.

Copyright information

© 2017 Springer Science+Business Media New York

Cite this entry

Chen, Z., Liu, B. (2017). Topic Models for NLP Applications. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7687-1_906
