Abstract
A bipartite structure is a common property of many real-world network data sets such as agents which are affiliated with societies, customers who buy, rent, or rate products, and authors who write scientific papers. The one-mode projection of these networks onto either set of entities (e.g., societies, products, and articles) is a well-established approach for the analysis of such data and deduces relations between these entities. Some bipartite data sets of key importance contain several distinct types of relations between their entities. These networks require a projection method which accounts for multiple edge types. In this article, we present the multiplex extension of an existing projection algorithm for simplex bipartite networks, i.e., networks that contain a single type of relation. We use synthetic data to show the robustness of our method before applying it to a real-world network of user ratings for films, namely, the Netflix data set. Based on the assumption that co-ratings of films contain information about the films’ similarity, we analyse the multiplex projection as an approximation of the similarity landscape of the films. Besides comparing the projection to the coarse-grained classification of films into genres, we validate the resulting similarities based on ground truth data sets containing film series. Our analysis confirms the predictive power of the network of positive co-ratings. We furthermore explore the potential of additional, mixed co-rating patterns in improving the prediction of similarities and highlight necessary criteria for this approach.









Notes
Note that in the Netflix data set used here, each user rates a given film only once, by either liking or disliking it. Thus, the maximal multiplicity of the resulting bipartite graph is 1.
We also ran experiments on synthetic data in which the degree sequence of one node set of the bipartite graph was more homogeneous. These preliminary results, part of work in progress, indicate that the presented multiplex one-mode projection is also robust under a different network model.
Ground truth is a standard term in machine learning that denotes the set of observations a good algorithm is expected to rediscover. Any algorithm can then be evaluated by four counts: true positives (predicted and present in the ground truth), false positives (predicted but not present in the ground truth), true negatives (not predicted and not present in the ground truth), and false negatives (not predicted but present in the ground truth).
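The four counts defined in this note can be sketched in a few lines of Python. This is only an illustration, not the evaluation code used in the article: the function name `confusion_counts`, the film labels, and the example pairs are hypothetical, and the sketch assumes that both the predictions and the ground truth are subsets of a known universe of candidate film pairs.

```python
from itertools import combinations


def confusion_counts(predicted, ground_truth, universe):
    """Return TP, FP, TN, FN counts for predicted pairs.

    Assumption (not from the article): `predicted` and `ground_truth`
    are both subsets of `universe`, the set of all candidate pairs.
    """
    predicted, ground_truth, universe = map(set, (predicted, ground_truth, universe))
    tp = len(predicted & ground_truth)             # predicted, in ground truth
    fp = len(predicted - ground_truth)             # predicted, not in ground truth
    fn = len(ground_truth - predicted)             # not predicted, in ground truth
    tn = len(universe - predicted - ground_truth)  # neither predicted nor in ground truth
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}


# Hypothetical toy example: four films give six candidate pairs.
films = ["A", "B", "C", "D"]
universe = set(combinations(films, 2))
ground_truth = {("A", "B"), ("C", "D")}  # e.g., pairs from the same film series
predicted = {("A", "B"), ("A", "C")}     # pairs linked by some projection

counts = confusion_counts(predicted, ground_truth, universe)
print(counts)
```

The four counts always sum to the size of the universe, which is a quick sanity check on any evaluation of this kind.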
The Area Under the (receiver operating characteristic, ROC) Curve (AUC) is a standard machine learning measure that quantifies the probability that a given algorithm assigns a higher score to a randomly chosen positive instance (here, a pair in the ground truth) than to a randomly chosen negative instance (Fawcett 2006). Thus, a one-mode projection algorithm that perfectly recovers the ground truth has an AUC of 1, while random guessing results in an AUC of 0.5.
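The probabilistic reading of the AUC can be made concrete with a small sketch. It follows the convention in Fawcett (2006) that a higher score indicates a stronger prediction, and it counts ties as one half. The function name `pairwise_auc` and the example scores are hypothetical, and the quadratic pairwise loop is chosen for clarity, not for efficiency on data of Netflix scale.

```python
def pairwise_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    `labels` are truthy for instances in the ground truth. A pair counts 1
    if the positive instance scores higher, 0.5 on a tie, and 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative instance")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical scores for four film pairs; the first two are in the ground truth.
labels = [1, 1, 0, 0]
perfect = pairwise_auc([0.9, 0.8, 0.3, 0.1], labels)  # every positive above every negative
mixed = pairwise_auc([0.9, 0.4, 0.6, 0.1], labels)    # one of four pairs is misordered
tied = pairwise_auc([0.5, 0.5, 0.5, 0.5], labels)     # uninformative scores
print(perfect, mixed, tied)
```

With uninformative (all equal) scores every pair is a tie, so the sketch returns 0.5, which matches the value stated above for random guessing.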
References
Ahn YY, Bagrow JP, Lehmann S (2010) Link communities reveal multiscale complexity in networks. Nature 466:761–764
Barabási AL, Jeong H, Néda Z, Ravasz E, Schubert A, Vicsek T (2002) Evolution of the social network of scientific collaborations. Physica A 311:590–614
boyd d, Crawford K (2011) Six provocations for big data. In: A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society
Breiger RL (1974) The duality of persons and groups. Soc Forces 53(2):181–190
Bródka P, Stawiak P, Kazienko P (2011) Shortest path discovery in the multi-layered social network. In: Proceedings of the 2011 International Conference on Advances in Social Networks Analysis and Mining (ASONAM ’11), pp 497–501
Campbell C, Yang S, Albert R, Shea K (2011) A network model for plant–pollinator community assembly. Proc Natl Acad Sci 108:197–202
Davis D, Lichtenwalter R, Chawla NV (2012) Supervised methods for multi-relational link prediction. Soc Netw Anal Min, pp 1–15
Eagle N, Pentland AS, Lazer D (2009) Inferring friendship network structure by using mobile phone data. Proc Natl Acad Sci 106:15274–15278
Fawcett T (2006) An introduction to ROC analysis. Pattern Recognit Lett 27:861–874
Foster JG, Foster DV, Grassberger P, Paczuski M (2010) Edge direction and the structure of networks. Proc Natl Acad Sci 107(24):10815–10820
Film series at Wikipedia. http://en.wikipedia.org/wiki/Film_series/
Gionis A, Mannila H, Mielikäinen T, Tsaparas P (2007) Assessing data mining results via swap randomization. ACM Trans Knowl Discov Data 1(3). Art no 14
Girvan M, Newman MEJ (2002) Community structure in social and biological networks. Proc Natl Acad Sci 99:7821–7826
Gómez-Gardeñes J, Vilone D, Sanchez A (2011) Disentangling social and group heterogeneities: Public Goods games on complex networks. Europhys Lett 95:68003
Gotelli NJ, Graves GR (1996) Null-Models in Ecology. Smithsonian Institution Press, Washington, DC
Holme P, Liljeros F, Edling CR, Kim BJ (2003) Network bipartivity. Phys Rev E 68:056107
Horvát EÁ, Zweig KA (2012) One-mode projection of bipartite graphs. In: Proceedings of the 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM ’12), pp 598–605
Kazienko P, Musial K, Kajdanowicz T (2011) Multidimensional social network in the social recommender system. IEEE Trans Syst Man Cybern Part A Syst Hum 41(4):746–759
Lancichinetti A, Fortunato S, Radicchi F (2008) Benchmark graphs for testing community detection algorithms. Phys Rev E 78:046110
Lehmann S, Schwartz M, Hansen LK (2008) Biclique communities. Phys Rev E 78:016108
Lewis K, Kaufman J, Gonzalez M, Wimmer A, Christakis N (2008) Tastes, ties, and time: a new social network dataset using facebook.com. Soc Netw 30:330–342
Li M, Fan Y, Chen J, Gao L, Di Z, Wu J (2005) Weighted networks of scientific communication: the measurement and topological role of weight. Physica A 350:643–656
Li N, Chen G (2009) Multi-layered friendship modeling for location-based mobile social networks. In: Proceedings of Mobiquitous 2009 (MobiQuitous ’09), pp 1–10
Magnani M, Rossi L (2011) The ML-model for multi-layer social networks. In: Proceedings of the 2011 International Conference on Advances in Social Networks Analysis and Mining (ASONAM ’11), pp 5–12
Mane KK, Börner K (2004) Mapping topics and topic bursts in PNAS. Proc Natl Acad Sci 101:5287–5290
McPherson M, Smith-Lovin L, Cook JM (2001) Birds of a feather: homophily in social networks. Annu Rev Sociol 27:415–444
Milo R, Shen-Orr SS, Itzkovitz S, Kashtan N, Chklovskii D, Alon U (2004) Network motifs: simple building blocks of complex networks. Science 298:824–827
Mucha PJ, Richardson T, Macon K, Porter MA, Onnela JP (2010) Community structure in time-dependent, multiscale, and multiplex networks. Science 328:876–878
Neal Z (2013) Identifying statistically significant edges in one-mode projections. Soc Netw Anal Min, pp 1–10
Newman MEJ (2001a) Scientific collaboration networks. I. Network construction and fundamental results. Phys Rev E 64:016131
Newman MEJ (2001b) Scientific collaboration networks. II. Shortest paths, weighted networks, and centrality. Phys Rev E 64:016132
Newman MEJ (2002) Assortative mixing in networks. Phys Rev Lett 89:208701
Newman MEJ (2004) Coauthorship networks and patterns of scientific collaboration. Proc Natl Acad Sci 101:5200–5205
Park J, Barabási AL (2007) Distribution of node characteristics in complex networks. Proc Natl Acad Sci 104(46):17916–17920
Piatetsky-Shapiro G, Frawley W (1991) Knowledge Discovery in Databases. AAAI/MIT Press, Cambridge, pp 229–248
Ramasco JJ, Dorogovtsev S, Pastor-Satorras R (2004) Self-organization of collaboration networks. Phys Rev E 70:036106
Ramasco JJ, Morris SA (2006) Social inertia in collaboration networks. Phys Rev E 73:016122
Saavedra S, Reed-Tsochas F, Uzzi B (2009) A simple model of bipartite cooperation for ecological and organizational networks. Nature 457:463–466
Shen-Orr SS, Milo R, Mangan S, Alon U (2002) Network motifs in the transcriptional regulation network of Escherichia coli. Nat Genet 31:64–68
Szell M, Lambiotte R, Thurner S (2010) Multirelational organization of large-scale social networks in an online world. Proc Natl Acad Sci 107:13636–13641
Szell M, Thurner S (2010) Measuring social dynamics in a massive multiplayer online game. Soc Netw 32:313–329
The Internet Movie Database (IMDb). Alternative interfaces. http://imdb.com/interfaces
The Netflix Prize. http://www.netflixprize.com/
Uhlmann S, Mannsperger H, Zhang JD, Horvát EÁ, Schmidt C, Küblbeck M, Ward A, Tschulena U, Zweig K, Korf U, Wiemann S, Sahin Ö (2012) Global miRNA regulation of a local protein network: case study with the EGFR-driven cell cycle network in breast cancer. Mol Syst Biol 8:570
Wasserman S, Faust K (1994) Social network analysis: methods and applications. Cambridge University Press, Cambridge
Watts DJ, Strogatz SH (1998) Collective dynamics of ‘small-world’ networks. Nature 393:440–442
Yeger-Lotem E, Sattath S, Kashtan N, Itzkovitz S, Milo R, Pinter RY, Alon U, Margalit H (2004) Network motifs in integrated cellular networks of transcription–regulation and protein–protein interaction. Proc Natl Acad Sci 101:5934–5939
Zahoránszky L, Katona G, Hári P, Málnási-Csizmadia A, Zweig K, Zahoránszky-Kőhalmi G (2009) Breaking the hierarchy—a new cluster selection mechanism for hierarchical clustering methods. Algorithms Mol Biol 4:12
Zhou T, Ren J, Medo M, Zhang YC (2007) Bipartite network projection and personal recommendation. Phys Rev E 76:046115
Zweig KA (2010) How to forget the second side of the story: a new method for the one-mode projection of bipartite graphs. In: Proceedings of the Second International Conference on Advances in Social Networks Analysis and Mining (ASONAM ’10), pp 200–207
Zweig KA, Kaufmann M (2011) A systematic approach to the one-mode projection of bipartite graphs. Soc Netw Anal Min 1(3):187–218
Acknowledgments
The authors would like to thank Andreas Spitz for useful discussions, ground truth data, and software. The authors are also grateful to the anonymous reviewers for their helpful comments. EÁH is supported by the Heidelberg Graduate School of Mathematical and Computational Methods for the Sciences, University of Heidelberg, Germany, which is funded by the German Excellence Initiative (GSC 220).
About this article
Cite this article
Horvát, EÁ., Zweig, K.A. A fixed degree sequence model for the one-mode projection of multiplex bipartite graphs. Soc. Netw. Anal. Min. 3, 1209–1224 (2013). https://doi.org/10.1007/s13278-013-0133-9
DOI: https://doi.org/10.1007/s13278-013-0133-9