C&C: An Effective Algorithm for Extracting Web Community Cores

Zhang, Xianchao; Li, Yueting; Liang, Wenxin

doi:10.1007/978-3-642-14589-6_32

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6193)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications

Abstract

Communities are a significant pattern of the Web. A community is a group of pages related to a common topic. Web communities can be characterized by dense bipartite subgraphs, and each community almost surely contains at least one core, where a core is a complete bipartite graph (CBG). Focusing on the problem of extracting such community cores from the Web, this paper proposes C&C, an effective algorithm based on combination and consolidation that extracts all embedded cores in web graphs. Experiments on large real-world data collections demonstrate that C&C is efficient and effective for community core extraction because: 1) all the largest emerging cores can be identified; 2) identifying all the embedded cores of different sizes requires only a single pass of C&C; 3) the extraction process needs no user-determined parameters.
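
To make the notion of a core concrete, the following minimal Python sketch checks whether two sets of pages form a complete bipartite graph in a toy web graph. It is an illustration of the definition only, not the C&C algorithm, whose combination and consolidation steps are not described in this abstract. The names `out_links`, `is_core`, and `centers_shared_by`, the "fans" and "centers" terminology, and the toy data are all assumptions introduced here.

```python
def is_core(out_links, fans, centers):
    """Return True if every page in `fans` links to every page in `centers`.

    That is the complete bipartite graph (CBG) condition for a core.
    Empty sides are rejected so that a trivial pair does not count.
    """
    if not fans or not centers:
        return False
    return all(set(centers) <= out_links.get(f, set()) for f in fans)


def centers_shared_by(out_links, fans):
    """Return the pages linked to by every page in `fans`.

    For a fixed fan set this is the largest center set that still
    forms a complete bipartite graph with it.
    """
    link_sets = [out_links.get(f, set()) for f in fans]
    return set.intersection(*link_sets) if link_sets else set()


# Hypothetical toy web graph: page -> set of pages it links to.
out_links = {
    "a": {"x", "y", "z"},
    "b": {"x", "y"},
    "c": {"x", "y", "w"},
}

print(is_core(out_links, {"a", "b", "c"}, {"x", "y"}))       # True
print(is_core(out_links, {"a", "b"}, {"x", "y", "z"}))       # False: b does not link to z
print(sorted(centers_shared_by(out_links, {"a", "c"})))      # ['x', 'y']
```

Here pages a, b, and c together with x and y form a core with three fans and two centers. Enumerating every fan subset this way grows exponentially with the number of pages, which is why dedicated extraction algorithms are needed on real web graphs.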

This work was partially supported by NSFC under grant No. 60873180, and by the start-up funding (#1600-893313) for newly appointed academic staff of Dalian University of Technology, China.

Author information

Authors and Affiliations

School of Software, Dalian University of Technology, China
Xianchao Zhang, Yueting Li & Wenxin Liang

Editor information

Editors and Affiliations

Graduate School of Informatics, Kyoto University, Yoshida Honmachi, Sakyo, 606-8501, Kyoto, Japan
Masatoshi Yoshikawa
Information School, Renmin University of China, 100872, Beijing, China
Xiaofeng Meng
Graduate School of Engineering, University of Hyogo, 2167 Shosha, Himeji, 671-2280, Hyogo, Japan
Takayuki Yumoto
Graduate School of Informatics, Kyoto University, Yoshidahonmachi, Sakyo, 606-8501, Kyoto, Japan
Qiang Ma
Institute of HCI and Media Integration, Tsinghua University, 100084, Beijing, China
Lifeng Sun
Department of Information Science, Ochanomizu University, 2-1-1, Otsuka, Bunkyo-ku, 112-8610, Tokyo, Japan
Chiemi Watanabe

Cite this paper

Zhang, X., Li, Y., Liang, W. (2010). C&C: An Effective Algorithm for Extracting Web Community Cores. In: Yoshikawa, M., Meng, X., Yumoto, T., Ma, Q., Sun, L., Watanabe, C. (eds) Database Systems for Advanced Applications. DASFAA 2010. Lecture Notes in Computer Science, vol 6193. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14589-6_32

DOI: https://doi.org/10.1007/978-3-642-14589-6_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14588-9
Online ISBN: 978-3-642-14589-6
eBook Packages: Computer Science, Computer Science (R0)
