Topic-link LDA: joint models of topic and author community

Authors:
Yan Liu, IBM T.J. Watson Research Center, Yorktown Heights, NY
Alexandru Niculescu-Mizil, IBM T.J. Watson Research Center, Yorktown Heights, NY
Wojciech Gryc, Oxford University, Oxford, UK
ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning, June 2009, Pages 665–672. https://doi.org/10.1145/1553374.1553460

Published: 14 June 2009

ABSTRACT

Given a large-scale linked document collection, such as a collection of blog posts or a research literature archive, two fundamental problems have attracted considerable interest in the research community. One is to identify a set of high-level topics covered by the documents in the collection; the other is to uncover and analyze the social network of the documents' authors. So far these problems have been treated as separate and considered independently of each other. In this paper we argue that the two problems are in fact interdependent and should be addressed together. We develop a Bayesian hierarchical approach that performs topic modeling and author community discovery in one unified framework. The effectiveness of our model is demonstrated on two blog data sets from different domains and one research paper citation data set from CiteSeer.

Published in
ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning
June 2009, 1331 pages
ISBN: 9781605585161
DOI: 10.1145/1553374
General Chair: Andrea Danyluk (Williams College)
Program Chairs: Léon Bottou (NEC Laboratories America), Michael Littman (Rutgers University)
Copyright © 2009 by the author(s)/owner(s).
Publisher: Association for Computing Machinery, New York, NY, United States
Overall acceptance rate: 140 of 548 submissions (26%)