Abstract
Bipartite networks are a class of complex networks. They are widely used in applications such as social network analysis, collaborative filtering, and information retrieval. Partitioning a bipartite network into smaller modules gives insight into its structure. The main contributions of this paper are: (1) an MDL [21] criterion for identifying a good partition of a bipartite network; (2) a greedy algorithm based on combination theory, named MDL-greedy, that approximates the optimal partition of a bipartite network. The greedy algorithm automatically searches for the number of partitions and requires no user intervention; (3) experiments on synthetic datasets and the Southern Women dataset. The results show that our method produces higher-quality results than the state-of-the-art methods Cross-Association and Information-theoretic Co-clustering. The experiments also show that the proposed algorithm scales well. The largest improvements are about 14% in precision, 40% in ratio, and 70% in running time.
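To make the idea of an MDL criterion for a bipartite partition concrete, the following is a minimal sketch of a generic two-part description length, in the spirit of cross-association style costs. It is not the criterion defined in this paper, which the abstract does not spell out. The function names (`binary_entropy`, `description_length`), the edge-set input format, and the specific model cost (bits to name each node's group) are all assumptions made for illustration.

```python
import math
from collections import Counter


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) source; 0 for p in {0, 1}."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def description_length(edges, row_groups, col_groups):
    """Illustrative two-part code length (in bits) for a bipartite partition.

    edges      -- set of (row, col) pairs, one per edge
    row_groups -- row_groups[r] is the group index of row node r
    col_groups -- col_groups[c] is the group index of column node c
    Assumed cost: bits to name every node's group (model) plus
    cells * H(density) for each block of the adjacency matrix (data).
    """
    k = max(row_groups) + 1
    l = max(col_groups) + 1
    row_sizes = Counter(row_groups)
    col_sizes = Counter(col_groups)
    # Number of edges falling into each (row group, column group) block.
    ones = Counter((row_groups[r], col_groups[c]) for r, c in edges)

    model_bits = len(row_groups) * math.log2(k) + len(col_groups) * math.log2(l)
    data_bits = 0.0
    for i in range(k):
        for j in range(l):
            cells = row_sizes[i] * col_sizes[j]
            if cells:
                data_bits += cells * binary_entropy(ones[(i, j)] / cells)
    return model_bits + data_bits


# Toy network: rows 0-1 link to columns 0-1, rows 2-3 link to columns 2-3.
edges = {(r, c) for r in (0, 1) for c in (0, 1)} | {(r, c) for r in (2, 3) for c in (2, 3)}
good = description_length(edges, [0, 0, 1, 1], [0, 0, 1, 1])
single = description_length(edges, [0, 0, 0, 0], [0, 0, 0, 0])
print(good, single)
```

On this toy network the two-module partition yields a shorter code than the single-module one, because every block becomes uniformly dense or empty. A greedy search in this spirit would repeatedly apply whichever split or merge lowers the total code length and stop when no move helps.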
This work was supported by NSFC under Grant No. 60773169 and by the 11th Five-Year Key Programs for Sci. & Tech. Development of China under Grant No. 2006BAI05A01.
References
Newman, M.E.J.: The Structure and Function of Complex Networks. SIAM Review 45, 167–256 (2003)
Guimerà, R., Amaral, L.: Functional cartography of complex metabolic networks. Nature 433, 895–900 (2005)
Danon, L., Duch, J., Diaz-Guilera, A., Arenas, A.: Comparing community structure identification. J. Stat. Mech. P09008 (2005)
Newman, M.E.J., Girvan, M.: Finding and Evaluating Community Structure in Networks. Physical Review E 69, 026113 (2004)
Linden, G., Smith, B., York, J.: Amazon.com Recommendations: Item-to-Item Collaborative Filtering. IEEE Internet Computing 7, 76–80 (2003)
Newman, M.E.J.: Modularity and community structure in networks. Proceedings of the National Academy of Sciences 103, 8577–8582 (2006)
Pujol, J., Béjar, J., Delgado, J.: Clustering algorithm for determining community structure in large networks. Physical Review E 74, 9 (2006)
Fortunato, S., Barthelemy, M.: Resolution Limit in Community Detection. Proceedings of the National Academy of Sciences 104, 36–41 (2007)
Rosvall, M., Bergstrom, C.T.: An information-theoretic framework for resolving community structure in complex networks. Proc. Natl. Acad. Sci. USA 104, 7327–7331 (2007)
Rosvall, M., Bergstrom, C.T.: Maps of random walks on complex networks reveal community structure. Proc. Natl. Acad. Sci. USA 105, 1118–1123 (2008)
Barron, A., Rissanen, J., Yu, B.: The minimum description length principle in coding and modeling. IEEE Transactions on Information Theory 44, 2743–2760 (1998)
Strogatz, S.H.: Exploring complex networks. Nature 410, 268–276 (2001)
Chakrabarti, D., Papadimitriou, S., Modha, D.S., Faloutsos, C.: Fully automatic cross-associations. In: Proc. Tenth ACM SIGKDD Int. Conf. Knowl. Discov. Data Min., pp. 79–88 (2004)
Dhillon, I.S., Mallela, S., Modha, D.S.: Information-theoretic co-clustering. In: KDD (2003)
Madeira, S.C., Oliveira, A.L.: Biclustering algorithms for biological data analysis: a survey. IEEE/ACM TCBB 1, 24–45 (2004)
Cheng, Y., Church, G.M.: Biclustering of expression data. In: ISMB (2000)
Guimerà, R., Sales-Pardo, M., Amaral, L.A.N.: Module identification in bipartite and directed networks. Physical Review E 76 (2007)
Barber, M.J.: Modularity and community detection in bipartite networks. Physical Review E 76 (2007)
Lehmann, S., Schwartz, M., Hansen, L.K.: Biclique communities. Phys. Rev. E 78, 016108 (2008)
Sun, J., Faloutsos, C., Papadimitriou, S., Yu, P.: GraphScope: Parameter-free mining of large time-evolving graphs. In: KDD (2007)
Papadimitriou, S., Sun, J., Faloutsos, C., Yu, P.: Hierarchical, parameter-free community discovery. In: Daelemans, W., Goethals, B., Morik, K. (eds.) ECML PKDD 2008, Part II. LNCS (LNAI), vol. 5212, pp. 170–187. Springer, Heidelberg (2008)
Sipser, M.: Introduction to the Theory of Computation. PWS Publishing Company (1997)
Ashlock, D.: Evolutionary computation for modeling and optimization. Springer, New York (2005)
Xu, K.K., Liu, Y.T., Tang, R., et al.: A novel method for real parameter optimization based on Gene Expression Programming. Applied Soft Computing 9, 725–737 (2009)
Rosen, K.H.: Discrete mathematics and its applications, 4th edn. WCB/McGraw-Hill, Boston (1999)
Girvan, M., Newman, M.E.J.: Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA 99, 7821–7826 (2002)
Davis, A., Gardner, B.B., Gardner, M.R.: Deep South. University of Chicago Press, Chicago (1941)
Freeman, L.: Dynamic Social Network Modeling and Analysis. The National Academies Press, Washington (2003)
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Xu, K., Tang, C., Li, C., Jiang, Y., Tang, R. (2010). An MDL Approach to Efficiently Discover Communities in Bipartite Network. In: Kitagawa, H., Ishikawa, Y., Li, Q., Watanabe, C. (eds) Database Systems for Advanced Applications. DASFAA 2010. Lecture Notes in Computer Science, vol 5981. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12026-8_45
DOI: https://doi.org/10.1007/978-3-642-12026-8_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12025-1
Online ISBN: 978-3-642-12026-8
eBook Packages: Computer Science (R0)