An Improved Algorithm for Extracting Research Communities from Bibliographic Data

Nakamura, Yushi; Horiike, Toshihiko; Taira, Yoshimasa; Sakamoto, Hiroshi

doi:10.1007/978-3-642-14589-6_34

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6193)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications

Abstract

In this paper we improve the performance of the algorithm in [1] for extracting communities from bibliographic data, an algorithm originally proposed for web community discovery in [2]. A web community is considered to be a set of web pages sharing a common topic; in other words, it is a dense subgraph induced in the web graph. Such subgraphs obtained by the max-flow algorithm are called max-flow communities, and in [1] this algorithm was adapted to obtain research communities from bibliographic data through a strategy for selecting community nodes. We propose an improvement of this algorithm based on careful selection of the initial seed node, and we evaluate its performance experimentally on a list of keywords that appear frequently in the data.
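
As an illustration of the max-flow community idea mentioned in the abstract, the following is a minimal sketch in Python. It is not the authors' improved algorithm, and it does not implement their seed-selection strategy, which the abstract does not specify. It assumes the commonly described construction from the web-community literature: a virtual source joined to the seed nodes with unbounded capacity, every graph edge given capacity k in both directions, every non-seed node joined to a virtual sink with capacity 1, and the community taken as the nodes left on the source side of a minimum cut. The function name `max_flow_community`, the default `k = len(seeds)`, and the toy graph are all hypothetical choices made for this example. Augmenting paths are found by breadth-first search, in the style of Edmonds and Karp.

```python
from collections import defaultdict, deque

INF = float("inf")


def max_flow_community(edges, seeds, k=None):
    """Return the nodes on the source side of a minimum source-sink cut.

    edges: iterable of undirected (u, v) pairs, e.g. co-authorship links.
    seeds: nodes assumed to belong to the community.
    k:     capacity of each graph edge (assumption: defaults to len(seeds)).
    """
    seeds = set(seeds)
    if k is None:
        k = len(seeds)
    source, sink = ("virtual", "source"), ("virtual", "sink")
    cap = defaultdict(lambda: defaultdict(int))  # residual capacities

    nodes = set(seeds)
    for u, v in edges:
        nodes.update((u, v))
        cap[u][v] += k
        cap[v][u] += k
    for s in seeds:
        cap[source][s] = INF      # seeds can never be cut from the source
    for n in nodes - seeds:
        cap[n][sink] = 1          # unit-cost link from each non-seed to the sink

    def reachable_from_source():
        """Breadth-first search over edges with positive residual capacity."""
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = reachable_from_source()
        if sink not in parent:
            break                 # no augmenting path left: flow is maximal
        # Bottleneck capacity along the shortest augmenting path.
        flow, v = INF, sink
        while parent[v] is not None:
            flow = min(flow, cap[parent[v]][v])
            v = parent[v]
        # Push the flow and update residual capacities.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= flow
            cap[v][u] += flow
            v = u

    # Nodes still reachable from the source form the source side of the cut.
    return set(parent) - {source}


# Toy graph (hypothetical data): two 4-cliques joined by one bridge edge.
group_a = ["a1", "a2", "a3", "a4"]
group_b = ["b1", "b2", "b3", "b4"]
edges = [(x, y) for g in (group_a, group_b)
         for i, x in enumerate(g) for y in g[i + 1:]]
edges.append(("a4", "b1"))

community = max_flow_community(edges, seeds={"a1", "a2"})
print(sorted(community))
```

In this toy graph the cheapest cut separates the first clique from the rest, so the two seeds pull in the other two members of their clique. The outcome depends on which nodes are used as seeds and on k, which is one reason seed selection matters for this family of methods.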

References

1. Horiike, T., Takahashi, Y., Kuboyama, T., Sakamoto, H.: Extracting research communities by improved maximum flow algorithm. In: Velásquez, J.D., Ríos, S.A., Howlett, R.J., Jain, L.C. (eds.) KES 2009, Part II. LNCS, vol. 5712, pp. 472–479. Springer, Heidelberg (2009)
2. Flake, G.W., Lawrence, S., Giles, C.L.: Efficient identification of web communities. In: KDD 2000, pp. 150–160 (2000)
3. Flake, G.W., Lawrence, S., Giles, C.L., Coetzee, F.: Self-organization and identification of web communities. IEEE Computer 35(3), 66–71 (2002)
4. Kumar, R., Raghavan, P., Rajagopalan, S., Tomkins, A.: Trawling the web for emerging cyber-communities. Computer Networks 31(11-16), 1481–1493 (1999)
5. Chakrabarti, S., Dom, B., Raghavan, P., Rajagopalan, S., Gibson, D., Kleinberg, J.M.: Automatic resource compilation by analyzing hyperlink structure and associated text. Computer Networks 30(1-7), 65–74 (1998)
6. Gibson, D., Kleinberg, J.M., Raghavan, P.: Inferring web communities from link topology. In: Hypertext 1998, pp. 225–234 (1998)
7. Kumar, R., Raghavan, P., Rajagopalan, S., Tomkins, A.: Extracting large-scale knowledge bases from the web. In: VLDB 1999, pp. 639–650 (1999)
8. Kleinberg, J.M.: Authoritative sources in a hyperlinked environment. In: SODA 1998, pp. 668–677 (1998)
9. Goldberg, A., Tarjan, R.: A new approach to the maximal flow problem. In: STOC 1986, pp. 136–146 (1986)
10. Ford Jr., L., Fulkerson, D.: Maximal flow through a network. Canadian Journal of Mathematics 8, 399–404 (1956)
11. Edmonds, J., Karp, R.M.: Theoretical improvements in algorithmic efficiency for network flow problems. J. ACM 19(2), 248–264 (1972)
12. CiteSeer.IST: http://citeseer.ist.psu.edu/
13. Imafuji, N., Kitsuregawa, M.: Effects of maximum flow algorithm on identifying web community. In: WIDM 2002, pp. 43–48 (2002)
14. Toyoda, M., Kitsuregawa, M.: Creating a web community chart for navigating related communities. In: Hypertext 2001, pp. 103–112 (2001)
15. Imafuji, N., Kitsuregawa, M.: Finding a web community by maximum flow algorithm with hits score based capacity. In: DASFAA 2003, pp. 101–106 (2003)
16. Dean, J., Henzinger, M.R.: Finding related pages in the world wide web. Computer Networks 31(11-16), 1467–1479 (1999)
17. Asano, Y., Nishizeki, T., Toyoda, M., Kitsuregawa, M.: Mining communities on the web using a max-flow and a site-oriented framework. IEICE Transactions 89-D(10), 2606–2615 (2006)

Author information

Authors and Affiliations

Kyushu Institute of Technology, 680-4 Kawazu, Iizuka-shi, Fukuoka, 820-8502, Japan
Yushi Nakamura, Toshihiko Horiike, Yoshimasa Taira & Hiroshi Sakamoto
PRESTO JST, Kawaguchi Center Building 4-1-8, Honcho, Kawaguchi-shi, Saitama, 332-0012, Japan
Hiroshi Sakamoto

Editor information

Editors and Affiliations

Graduate School of Informatics, Kyoto University, Yoshida Honmachi, Sakyo, 606-8501, Kyoto, Japan
Masatoshi Yoshikawa
Information School, Renmin University of China, 100872, Beijing, China
Xiaofeng Meng
Graduate School of Engineering, University of Hyogo, 2167 Shosha, Himeji, 671-2280, Hyogo, Japan
Takayuki Yumoto
Graduate School of Informatics, Kyoto University, Yoshidahonmachi, Sakyo, 606-8501, Kyoto, Japan
Qiang Ma
Institute of HCI and Media Integration, Tsinghua University, 100084, Beijing, China
Lifeng Sun
Department of Information Science, Ochanomizu University, 2-1-1, Otsuka, Bunkyo-ku, 112-8610, Tokyo, Japan
Chiemi Watanabe

About this paper

Cite this paper

Nakamura, Y., Horiike, T., Taira, Y., Sakamoto, H. (2010). An Improved Algorithm for Extracting Research Communities from Bibliographic Data. In: Yoshikawa, M., Meng, X., Yumoto, T., Ma, Q., Sun, L., Watanabe, C. (eds) Database Systems for Advanced Applications. DASFAA 2010. Lecture Notes in Computer Science, vol 6193. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14589-6_34

DOI: https://doi.org/10.1007/978-3-642-14589-6_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14588-9
Online ISBN: 978-3-642-14589-6
eBook Packages: Computer Science, Computer Science (R0)