Abstract
Data in the form of multiple matrices of relations among objects of a single type, representable as a collection of unipartite graphs, arise in a variety of biological settings, in collections of author-recipient email, and in social networks. Clustering the objects of study or situating them in a low-dimensional space (e.g., a simplex) is only one of the goals of the analysis of such data; being able to estimate relational structures among the clusters themselves may also be important. In [1], we introduced the family of stochastic block models of mixed membership to support such integrated data analyses. Our models combine features of mixed-membership models and block models for relational data in a hierarchical Bayesian framework. Here we present a nested variational inference scheme for this class of models, which is necessary to perform fast approximate posterior inference, and we use the models and the estimation scheme to examine two data sets: (1) a collection of sociometric relations among monks, used to investigate the crisis that took place in a monastery [2], and (2) data from a school-based longitudinal study of the health-related behaviors of adolescents. Both data sets have recently been reanalyzed in [3] using a latent position clustering model, and we compare our analyses with those presented there.
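As an illustration of how mixed membership and a block structure can be combined, the following sketch simulates one directed relation matrix. The abstract does not spell out the generative process, so the details here are assumptions that follow the commonly published form of such models: a Dirichlet prior on each object's membership vector, a fresh group draw for the sender and the receiver of every pair, and a Bernoulli tie whose probability is read from a block matrix. The names `sample_mmsb`, `alpha`, and `block_matrix` are hypothetical, and the sketch covers simulation only, not the nested variational inference scheme.

```python
import numpy as np

def sample_mmsb(n_nodes, alpha, block_matrix, seed=0):
    """Simulate one directed graph from an assumed mixed-membership block model."""
    rng = np.random.default_rng(seed)
    n_groups = len(alpha)
    # Each object gets its own membership vector, a point on the simplex.
    pi = rng.dirichlet(alpha, size=n_nodes)
    y = np.zeros((n_nodes, n_nodes), dtype=int)
    for p in range(n_nodes):
        for q in range(n_nodes):
            if p == q:
                continue  # self-relations are left at zero
            # Sender and receiver each draw a group for this pair only.
            g_send = rng.choice(n_groups, p=pi[p])
            g_recv = rng.choice(n_groups, p=pi[q])
            # The block matrix holds the tie probability between the two groups.
            y[p, q] = rng.binomial(1, block_matrix[g_send, g_recv])
    return pi, y

# Illustrative values only: three groups with dense within-group ties.
alpha = np.array([0.3, 0.3, 0.3])
block_matrix = np.array([
    [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.05, 0.90],
])
pi, y = sample_mmsb(10, alpha, block_matrix)
```

Because group indicators are redrawn for every pair, an object can act as a member of different groups in different relations, which is what separates this setup from a block model with a single hard assignment per object.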
References
Airoldi, E.M., Blei, D.M., Fienberg, S.E., Xing, E.P.: Stochastic block models of mixed membership. Manuscript under review (2006)
Sampson, F.S.: A Novitiate in a period of change: An experimental and case study of social relationships. PhD thesis, Cornell University (1968)
Handcock, M.S., Raftery, A.E., Tantrum, J.M.: Model-based clustering for social networks. Journal of the Royal Statistical Society, Series A 170, 1–22 (2007)
Holland, P.W., Leinhardt, S.: Local structure in social networks. In: Heise, D. (ed.) Sociological Methodology, pp. 1–45. Jossey-Bass, San Francisco (1975)
Fienberg, S.E., Meyer, M.M., Wasserman, S.: Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association 80, 51–67 (1985)
Wasserman, S., Pattison, P.: Logit models and logistic regression for social networks: I. An introduction to Markov graphs and p*. Psychometrika 61, 401–425 (1996)
Snijders, T.A.B.: Markov chain Monte Carlo estimation of exponential random graph models. Journal of Social Structure (2002)
Hoff, P.D., Raftery, A.E., Handcock, M.S.: Latent space approaches to social network analysis. Journal of the American Statistical Association 97, 1090–1098 (2002)
Doreian, P., Batagelj, V., Ferligoj, A.: Generalized Blockmodeling. Cambridge University Press, Cambridge (2004)
Taskar, B., Wong, M.F., Abbeel, P., Koller, D.: Link prediction in relational data. In: Neural Information Processing Systems, vol. 15 (2003)
Kemp, C., Griffiths, T.L., Tenenbaum, J.B.: Discovering latent classes in relational data. Technical Report AI Memo 2004-019, MIT (2004)
Kemp, C., Tenenbaum, J.B., Griffiths, T.L., Yamada, T., Ueda, N.: Learning systems of concepts with an infinite relational model. In: Proceedings of the 21st National Conference on Artificial Intelligence (2006)
McCallum, A., Wang, X., Mohanty, N.: Joint group and topic discovery from relations and text. In: Airoldi, E., Blei, D.M., Fienberg, S.E., Goldenberg, A., Xing, E.P., Zheng, A.X. (eds.) ICML 2006. LNCS, vol. 4503, pp. 28–44. Springer, Heidelberg (2007)
Blei, D.M., Ng, A., Jordan, M.I.: Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993–1022 (2003)
Cohn, D., Hofmann, T.: The missing link—A probabilistic model of document content and hypertext connectivity. In: Advances in Neural Information Processing Systems, vol. 13 (2001)
Erosheva, E.A., Fienberg, S.E., Lafferty, J.: Mixed-membership models of scientific publications. Proceedings of the National Academy of Sciences 97(22), 11885–11892 (2004)
Barnard, K., Duygulu, P., de Freitas, N., Forsyth, D., Blei, D., Jordan, M.: Matching words and pictures. Journal of Machine Learning Research 3, 1107–1135 (2003)
Erosheva, E.A., Fienberg, S.E.: Bayesian mixed membership models for soft clustering and classification. In: Weihs, C., Gaul, W. (eds.) Classification—The Ubiquitous Challenge, pp. 11–26. Springer, Heidelberg (2005)
Manton, K.G., Woodbury, M.A., Tolley, H.D.: Statistical Applications Using Fuzzy Sets. Wiley, Chichester (1994)
Rosenberg, N.A., Pritchard, J.K., Weber, J.L., Cann, H.M., Kidd, K.K., Zhivotovsky, L.A., Feldman, M.W.: Genetic structure of human populations. Science 298, 2381–2385 (2002)
Pritchard, J., Stephens, M., Donnelly, P.: Inference of population structure using multilocus genotype data. Genetics 155, 945–959 (2000)
Xing, E.P., Ng, A.Y., Jordan, M.I., Russell, S.: Distance metric learning with applications to clustering with side information. In: Advances in Neural Information Processing Systems, vol. 16 (2003)
Holland, P., Laskey, K.B., Leinhardt, S.: Stochastic blockmodels: Some first steps. Social Networks 5, 109–137 (1983)
Anderson, C.J., Wasserman, S., Faust, K.: Building stochastic blockmodels. Social Networks 14, 137–161 (1992)
Nowicki, K., Snijders, T.A.B.: Estimation and prediction for stochastic blockstructures. Journal of the American Statistical Association 96, 1077–1087 (2001)
Airoldi, E.M., Blei, D.M., Fienberg, S.E., Xing, E.P.: Admixtures of latent blocks with application to protein interaction networks. Manuscript under review (2006)
Airoldi, E.M., Fienberg, S.E., Xing, E.P.: Latent aspects analysis for gene expression data. Manuscript under review (2006)
Carley, K.M.: Smart agents and organizations of the future. In: Lievrouw, L., Livingstone, S. (eds.) The Handbook of New Media, pp. 206–220 (2002)
Jordan, M., Ghahramani, Z., Jaakkola, T., Saul, L.: Introduction to variational methods for graphical models. Machine Learning 37, 183–233 (1999)
Airoldi, E.M., Fienberg, S.E., Joutard, C., Love, T.M.: Discovering latent patterns with hierarchical Bayesian mixed-membership models and the issue of model choice. Technical Report CMU-ML-06-101, School of Computer Science, Carnegie Mellon University (2006)
Xing, E.P., Jordan, M.I., Russell, S.: A generalized mean field algorithm for variational inference in exponential families. In: Uncertainty in Artificial Intelligence, vol. 19 (2003)
Schervish, M.J.: Theory of Statistics. Springer, Heidelberg (1995)
Wainwright, M.J., Jordan, M.I.: Graphical models, exponential families and variational inference. Technical Report 649, Department of Statistics, University of California, Berkeley (2003)
David, G.B., Carley, K.M.: Clearing the FOG: Fuzzy, overlapping groups for social networks. Manuscript under review (2006)
Breiger, R.L., Boorman, S.A., Arabie, P.: An algorithm for clustering relational data with applications to social network analysis and comparison to multidimensional scaling. Journal of Mathematical Psychology 12, 328–383 (1975)
Harris, K.M., Florey, F., Tabor, J., Bearman, P.S., Jones, J., Udry, R.J.: The national longitudinal study of adolescent health: research design. Technical report, Carolina Population Center, University of North Carolina, Chapel Hill (2003)
Udry, R.J.: The national longitudinal study of adolescent health (Add Health): Waves I and II, 1994–1996; Wave III, 2001–2002. Technical report, Carolina Population Center, University of North Carolina, Chapel Hill (2003)
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Airoldi, E.M., Blei, D.M., Fienberg, S.E., Xing, E.P. (2007). Combining Stochastic Block Models and Mixed Membership for Statistical Network Analysis. In: Airoldi, E., Blei, D.M., Fienberg, S.E., Goldenberg, A., Xing, E.P., Zheng, A.X. (eds) Statistical Network Analysis: Models, Issues, and New Directions. ICML 2006. Lecture Notes in Computer Science, vol 4503. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73133-7_5
DOI: https://doi.org/10.1007/978-3-540-73133-7_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73132-0
Online ISBN: 978-3-540-73133-7
eBook Packages: Computer Science (R0)