Abstract
This paper describes a path-based method that uses multi-step navigation information, discovered from website structure, for web page ranking. The use of hyperlinks to enhance page ranking has been widely studied, under the assumption that hyperlinks convey recommendations. Although this technique has been used successfully in global web search, it produces poor results for website search, because the majority of hyperlinks within a local website are used to organize information and convey no recommendation. This paper defines the Hierarchical Navigation Path (HNP) as a new resource for exploiting these hyperlinks to improve web search. An HNP is composed of the multi-step hyperlinks that visitors follow when navigating a website, and it provides indications of the content of the destination page. The HierPathExt algorithm is given to extract HNPs from local websites, and the PathRank algorithm then uses HNPs for web page retrieval. The experiments show that this approach yields significant improvements over existing solutions.
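The idea that a multi-step navigation path can describe its destination page can be illustrated with a small sketch. This is not the HierPathExt or PathRank algorithm, whose details the abstract does not give. It assumes a site is represented as a dictionary of hyperlinks with anchor text, takes one shortest click path from the home page as a stand-in for a navigation path, and scores a page by the fraction of query terms found in the anchor text along that path. The function names, the toy site, and the scoring rule are all hypothetical.

```python
from collections import deque


def shortest_navigation_paths(links, home):
    """Breadth-first search from the home page.

    links maps a page URL to a list of (target URL, anchor text) pairs.
    Returns, for every reachable page, the anchor texts along one
    shortest click path from home (an assumed proxy for a navigation path).
    """
    paths = {home: []}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target, anchor in links.get(page, []):
            if target not in paths:
                paths[target] = paths[page] + [anchor]
                queue.append(target)
    return paths


def path_score(query, anchors):
    """Fraction of query terms that occur in the path's anchor text.

    A deliberately simple, hypothetical scoring rule for illustration.
    """
    terms = set(query.lower().split())
    if not terms:
        return 0.0
    path_terms = set(" ".join(anchors).lower().split())
    return len(terms & path_terms) / len(terms)


# Toy site structure (hypothetical data).
links = {
    "/": [("/research", "Research"), ("/teaching", "Teaching")],
    "/research": [("/research/ir", "Information Retrieval")],
    "/research/ir": [("/research/ir/p1", "Publications")],
}

paths = shortest_navigation_paths(links, "/")
score = path_score("information retrieval publications", paths["/research/ir/p1"])
```

In this toy site the page `/research/ir/p1` carries only the anchor "Publications" on its own incoming link, yet the full path from the home page also contributes "Research" and "Information Retrieval", which is the kind of contextual evidence a path-based method can draw on.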
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, J., Zhao, Y. (2009). PathRank: Web Page Retrieval with Navigation Path. In: Boughanem, M., Berrut, C., Mothe, J., Soule-Dupuy, C. (eds) Advances in Information Retrieval. ECIR 2009. Lecture Notes in Computer Science, vol 5478. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00958-7_32
DOI: https://doi.org/10.1007/978-3-642-00958-7_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00957-0
Online ISBN: 978-3-642-00958-7
eBook Packages: Computer Science, Computer Science (R0)