Multi-objective Topic Modeling

  • Conference paper
Evolutionary Multi-Criterion Optimization (EMO 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7811)

Abstract

Topic Modeling (TM) is a rapidly growing area at the interface of text mining, artificial intelligence and statistical modeling that is increasingly deployed to address the ‘information overload’ associated with extensive text repositories. The goal in TM is typically to infer a rich yet intuitive summary model of a large document collection: a set of topics that characterizes the collection, each topic being a probability distribution over words, along with the degree to which each individual document is concerned with each topic. The model then supports segmentation, clustering, profiling, browsing, and many other tasks. Current approaches to TM, dominated by Latent Dirichlet Allocation (LDA), assume a topic-driven document generation process and find a model that maximizes the likelihood of the data with respect to this process. This is clearly sensitive to any mismatch between the ‘true’ generating process and the statistical model, and it is also clear that the quality of a topic model is multi-faceted and complex: individual topics should be intuitively meaningful, sensibly distinct, and free of noise. Here we investigate multi-objective approaches to TM, which attempt to infer coherent topic models by navigating the trade-offs between objectives oriented towards coherence and coverage of the corpus at hand. Comparisons with LDA show that multi-objective evolutionary algorithm (MOEA) approaches yield significantly more coherent topics than LDA, consequently enhancing the use and interpretability of these models in a range of applications, without significant degradation in generalization ability.
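
The trade-off described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the objective functions, so everything below is an assumption made for illustration: coherence is approximated by the mean pairwise pointwise mutual information (PMI) of a topic's top words over document co-occurrence, the coverage scores are hypothetical numbers, and candidate models are compared by plain Pareto dominance rather than by the MOEA used in the paper.

```python
import math
from itertools import combinations


def pmi_coherence(top_words, docs, eps=1e-12):
    """Mean pairwise PMI of a topic's top words (assumed coherence measure).

    Probabilities are estimated from document-level co-occurrence;
    eps avoids log(0) for pairs that never co-occur.
    """
    n = len(docs)
    doc_sets = [set(d) for d in docs]

    def prob(*words):
        # Fraction of documents containing all the given words.
        return sum(all(w in s for w in words) for s in doc_sets) / n

    scores = [
        math.log((prob(a, b) + eps) / (prob(a) * prob(b) + eps))
        for a, b in combinations(top_words, 2)
    ]
    return sum(scores) / len(scores)


def dominates(u, v):
    """True if objective vector u Pareto-dominates v (all objectives maximized)."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def pareto_front(scored):
    """Names of candidates whose objective vectors are not dominated by any other."""
    return [
        name
        for name, s in scored.items()
        if not any(dominates(t, s) for other, t in scored.items() if other != name)
    ]


# Toy corpus: each document is a list of tokens (hypothetical data).
docs = [["cat", "dog"], ["cat", "dog"], ["car", "road"], ["car", "road"]]

# Hypothetical candidate models scored as (coherence, coverage).
candidates = {
    "A": (0.9, 0.2),
    "B": (0.5, 0.5),
    "C": (0.4, 0.4),  # dominated by B on both objectives
    "D": (0.1, 0.9),
}

front = pareto_front(candidates)
```

In this toy setting, `pmi_coherence(["cat", "dog"], docs)` is higher than `pmi_coherence(["cat", "road"], docs)` because the first pair co-occurs and the second never does, and `front` keeps every candidate except the dominated one. A multi-objective search would return such a front of alternative models rather than a single maximum-likelihood model.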

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Khalifa, O., Corne, D.W., Chantler, M., Halley, F. (2013). Multi-objective Topic Modeling. In: Purshouse, R.C., Fleming, P.J., Fonseca, C.M., Greco, S., Shaw, J. (eds) Evolutionary Multi-Criterion Optimization. EMO 2013. Lecture Notes in Computer Science, vol 7811. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37140-0_8

  • DOI: https://doi.org/10.1007/978-3-642-37140-0_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37139-4

  • Online ISBN: 978-3-642-37140-0

  • eBook Packages: Computer Science, Computer Science (R0)
