DOI: 10.1145/1242572.1242728
Article

A link classification based approach to website topic hierarchy generation

Published: 08 May 2007

Abstract

Hierarchical models are commonly used to organize a Website's content. A Website's content structure can be represented by a topic hierarchy, a directed tree rooted at the Website's homepage in which the vertices and edges correspond to Web pages and hyperlinks. In this work, we propose a new method for constructing the topic hierarchy of a Website. We model the Website's link structure as a weighted directed graph, in which the edge weights are computed using a classifier that predicts whether an edge connects a pair of nodes representing a topic and a sub-topic. We then pose the problem of building the topic hierarchy as finding the shortest-path tree and the directed minimum spanning tree in the weighted graph. We conducted extensive experiments on real Websites and obtained very promising results.
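
To make the shortest-path-tree formulation concrete, the following is a minimal sketch rather than the authors' implementation. It assumes each hyperlink already carries a classifier probability that it connects a topic to a sub-topic, and it assumes the edge cost is the negative log of that probability; the abstract does not say how weights are derived from the classifier output. The function name `shortest_path_tree`, the page names, and the probabilities are all hypothetical.

```python
import heapq
import math


def shortest_path_tree(edges, root):
    """Return {child: parent} for the shortest-path tree rooted at `root`.

    `edges` maps (source, target) -> a hypothetical classifier probability,
    in (0, 1], that the hyperlink leads from a topic to one of its sub-topics.
    """
    # Assumption: cost = -log(p), so likely topic/sub-topic links are cheap.
    graph = {}
    for (src, dst), prob in edges.items():
        graph.setdefault(src, []).append((dst, -math.log(prob)))

    dist = {root: 0.0}
    parent = {}
    heap = [(0.0, root)]
    while heap:  # Dijkstra's algorithm; valid because all costs are >= 0
        d, node = heapq.heappop(heap)
        if d > dist.get(node, math.inf):
            continue  # stale queue entry
        for nxt, cost in graph.get(node, []):
            nd = d + cost
            if nd < dist.get(nxt, math.inf):
                dist[nxt] = nd
                parent[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    return parent


# Toy site with made-up probabilities (not data from the paper).
links = {
    ("home", "products"): 0.9,
    ("home", "laptops"): 0.2,   # shortcut link, unlikely topic/sub-topic
    ("products", "laptops"): 0.9,
    ("laptops", "home"): 0.1,   # navigation link back to the homepage
    ("home", "about"): 0.8,
}
tree = shortest_path_tree(links, "home")
```

In this toy example the page `laptops` is attached under `products` rather than directly under `home`, because the two-hop path through high-probability links is cheaper than the low-probability shortcut. The directed minimum spanning tree alternative mentioned in the abstract would need a different algorithm (for example Chu-Liu/Edmonds) and is not shown here.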

Published In

WWW '07: Proceedings of the 16th international conference on World Wide Web
May 2007
1382 pages
ISBN:9781595936547
DOI:10.1145/1242572
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. content structure
  2. topic hierarchy
  3. website mining

Qualifiers

  • Article

Conference

WWW'07
WWW'07: 16th International World Wide Web Conference
May 8 - 12, 2007
Banff, Alberta, Canada

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 13
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 19 Feb 2025

Citations

Cited By

  • (2017) Explicitly and implicitly exploiting the hierarchical structure for mining website interests on news events. Information Sciences: an International Journal 420:C (263-277). DOI: 10.1016/j.ins.2017.08.056. Online publication date: 1-Dec-2017
  • (2011) Multilingual document mining and navigation using self-organizing maps. Information Processing and Management: an International Journal 47:5 (647-666). DOI: 10.1016/j.ipm.2009.12.003. Online publication date: 1-Sep-2011
  • (2009) Keyphrase extraction for labeling a website topic hierarchy. Proceedings of the 11th International Conference on Electronic Commerce (81-88). DOI: 10.1145/1593254.1593266. Online publication date: 12-Aug-2009
  • (2009) Automatic Construction of Multilingual Web Directory Using Self-Organizing Maps. Proceedings of the 2009 Fifth International Joint Conference on INC, IMS and IDC (1283-1288). DOI: 10.1109/NCM.2009.78. Online publication date: 25-Aug-2009
  • (2009) Multilingual Hierarchy Generation and Alignment Using Self-Organizing Maps. Proceedings of the 2009 International Conference on Education Technology and Computer (326-330). DOI: 10.1109/ICETC.2009.9. Online publication date: 17-Apr-2009
  • (2008) User oriented link function classification. Proceedings of the 17th international conference on World Wide Web (1191-1192). DOI: 10.1145/1367497.1367721. Online publication date: 21-Apr-2008
