Web Structure Mining by Isolated Cliques
Yushi UNO Yoshinobu OTA Akio UEMICHI
Publication: IEICE TRANSACTIONS on Information and Systems, Vol.E90-D, No.12, pp.1998-2006
Publication Date: 2007/12/01
Online ISSN: 1745-1361
Print ISSN: 0916-8532
DOI: 10.1093/ietisy/e90-d.12.1998
Type of Manuscript: PAPER
Category: Data Mining
Keyword: link analysis, link farm, isolated clique, webgraph, web community, web structure mining
Summary:
The link structure of the Web is commonly modeled as a graph, called the webgraph. Web structure mining is a research area that mainly aims to find hidden communities in the webgraph, on the premise that communities or their cores form dense subgraphs. Structure mining can therefore be realized by enumerating such substructures, of which Kleinberg's biclique model is a well-known example. In this paper, we examine several candidate substructures, including conventional bicliques, and attempt to extract useful information from real web data. In particular, we introduce isolated cliques into our structure mining experiments. As a result, we found that isolated cliques spanning multiple domains can represent useful communities, which supports the validity of the isolated clique as a candidate substructure for structure mining. On the other hand, we also observed that most isolated cliques on the Web correspond to menu structures and are confined to single domains, and that isolated cliques can be quite useful for detecting harmful link farms.
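
The summary does not define an isolated clique, so the following is only a minimal sketch under a stated assumption: a clique of k pages is treated as isolated when fewer than c*k links leave it, for some constant c. The helper names, the undirected adjacency-set representation, the default c = 1.0, and the example URLs are all hypothetical and are not taken from the paper.

```python
from itertools import combinations
from urllib.parse import urlparse


def is_clique(graph, nodes):
    """Return True if every pair of distinct nodes is adjacent."""
    return all(v in graph[u] for u, v in combinations(nodes, 2))


def outgoing_edge_count(graph, nodes):
    """Count edges with exactly one endpoint inside `nodes`."""
    members = set(nodes)
    return sum(len(graph[u] - members) for u in members)


def is_isolated_clique(graph, nodes, c=1.0):
    """Assumed definition: a clique of size k with fewer than c*k outgoing edges."""
    members = set(nodes)
    if not is_clique(graph, members):
        return False
    return outgoing_edge_count(graph, members) < c * len(members)


def spans_multiple_domains(nodes):
    """Return True if the page URLs belong to more than one host."""
    return len({urlparse(url).hostname for url in nodes}) > 1


# Hypothetical undirected webgraph: three mutually linked pages on
# different hosts, plus one extra page linked only to the third.
A = "http://a.example/"
B = "http://b.example/"
C = "http://c.example/"
D = "http://a.example/menu"
graph = {
    A: {B, C},
    B: {A, C},
    C: {A, B, D},
    D: {C},
}

candidate = {A, B, C}
print(is_isolated_clique(graph, candidate))
print(spans_multiple_domains(candidate))
```

Under this assumed definition, a candidate that passes both checks would be the kind of multi-domain isolated clique the summary associates with useful communities, while one confined to a single host would fall into the menu-structure case.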