Abstract
To explain why a user generates the content and behaviors we observe, one has to determine the user’s topical interests as well as those of her community. Most existing work on modeling microblogging users and their communities, however, is based on either user-generated content or user behaviors, but not both. In this paper, we propose the Community and Personal Interest (CPI) model, which models the interests of microblogging users jointly with those of their communities using both content and behaviors. The CPI model also provides a common framework that accommodates multiple types of user behaviors. Unlike other models, CPI does not assume a hierarchical relationship between personal interest and community interest, i.e., it does not assume that one is determined purely by the other. We build the CPI model on the principle that a user’s personal interest is different from that of her community. We further develop a regularization technique that biases the model toward learning more socially meaningful topics for each community. Our experiments on a Twitter dataset show that the CPI model outperforms other state-of-the-art models in topic learning and user classification tasks. We also demonstrate through representative case examples that the CPI model can effectively mine community interest.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Hoang, TA. (2015). Modeling User Interest and Community Interest in Microbloggings: An Integrated Approach. In: Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D., Motoda, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2015. Lecture Notes in Computer Science, vol 9077. Springer, Cham. https://doi.org/10.1007/978-3-319-18038-0_55
DOI: https://doi.org/10.1007/978-3-319-18038-0_55
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18037-3
Online ISBN: 978-3-319-18038-0
eBook Packages: Computer Science (R0)