MEI: Mutual Enhanced Infinite Generative Model for Simultaneous Community and Topic Detection

  • Conference paper
Discovery Science (DS 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6926)

Abstract

Community and topic are two widely studied patterns in social network analysis. However, most existing studies either use textual content to improve community detection or use link structure to guide topic modeling. Recently, some studies have taken both the link-emphasized community and the text-emphasized topic into account, but they model community and topic with the same latent variable. In practice, community and topic are different from each other, so it is more reasonable to model them with different variables. To discover communities, topics, and their relations simultaneously, a mutual enhanced infinite generative model (MEI) is proposed. This model distinguishes community from topic and relates them via community-topic distributions. Communities and topics can be detected simultaneously and can enhance each other during the learning process. To determine the appropriate number of communities and topics automatically, the Hierarchical/Dirichlet Process Mixture model (H/DPM) is employed. A Gibbs sampling based approach is adopted to learn the model parameters. Experiments are conducted on a co-author network extracted from DBLP, in which each author is associated with his or her published papers. Experimental results show that the proposed model outperforms several baseline models in terms of perplexity and link prediction performance.
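To make the idea of inferring the number of groups automatically more concrete, the following minimal sketch draws cluster labels from a Chinese restaurant process, a standard representation of the Dirichlet process prior. It is an illustration only, not the MEI model or the authors' Gibbs sampler: it samples from the prior without any link or text likelihood, and the function name `crp_assignments` and the parameters `n_items`, `alpha`, and `seed` are hypothetical names chosen for this example.

```python
import random
from collections import Counter


def crp_assignments(n_items, alpha, seed=0):
    """Draw cluster labels for n_items from a Chinese restaurant process prior.

    alpha > 0 is the concentration parameter: larger values make it more
    likely that an item opens a new cluster. (Illustrative helper, not from
    the paper.)
    """
    rng = random.Random(seed)
    counts = []  # counts[k] = number of items currently assigned to cluster k
    labels = []
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1
        labels.append(k)
    return labels


if __name__ == "__main__":
    labels = crp_assignments(n_items=200, alpha=1.0)
    sizes = Counter(labels)
    print(len(sizes), "clusters; largest sizes:", sizes.most_common(3))
```

The number of clusters is not fixed in advance; it grows with the data at a rate controlled by `alpha`. In a full Dirichlet process mixture sampler, each weight would additionally be multiplied by the likelihood of the item under the corresponding cluster.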

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Duan, D., Li, Y., Li, R., Lu, Z., Wen, A. (2011). MEI: Mutual Enhanced Infinite Generative Model for Simultaneous Community and Topic Detection. In: Elomaa, T., Hollmén, J., Mannila, H. (eds) Discovery Science. DS 2011. Lecture Notes in Computer Science, vol 6926. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24477-3_10

  • DOI: https://doi.org/10.1007/978-3-642-24477-3_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24476-6

  • Online ISBN: 978-3-642-24477-3

  • eBook Packages: Computer Science, Computer Science (R0)
