PathRank: Web Page Retrieval with Navigation Path

Conference paper
In: Advances in Information Retrieval (ECIR 2009)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5478)

Abstract

This paper describes a path-based method that uses the multi-step navigation information discovered from website structures for web page ranking. The use of hyperlinks to enhance page ranking has been widely studied, under the assumption that hyperlinks convey recommendations. Although this technique has been used successfully in global web search, it produces poor results for website search, because the majority of hyperlinks within a local website serve to organize information and convey no recommendation. This paper defines the Hierarchical Navigation Path (HNP) as a new resource for exploiting these hyperlinks to improve web search. An HNP is composed of the multi-step hyperlinks that visitors follow when navigating a website, and it provides indications of the content of the destination page. The HierPathExt algorithm is presented to extract HNPs from local websites, and the PathRank algorithm then uses the HNPs for web page retrieval. Experiments show that the approach yields significant improvements over existing solutions.
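
The abstract does not spell out HierPathExt or PathRank, so the following is only an illustrative sketch of the general idea that a multi-step navigation path can describe its destination page. It assumes, purely for illustration, that a navigation path is the first shortest chain of links found by breadth-first search from the site's home page, and that a path is scored by the fraction of query terms appearing in its anchor texts. The function names `shortest_navigation_paths` and `path_match_score`, the toy `site` graph, and the scoring rule are all hypothetical and are not taken from the paper.

```python
from collections import deque


def shortest_navigation_paths(links, root):
    """Breadth-first search from `root`.

    `links` maps a page URL to a list of (anchor_text, target_url) pairs.
    Returns {page: [anchor texts along the first shortest path found]}.
    This is an assumed stand-in for path extraction, not HierPathExt.
    """
    paths = {root: []}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for anchor, target in links.get(page, []):
            if target not in paths:  # first visit is a shortest path in BFS
                paths[target] = paths[page] + [anchor]
                queue.append(target)
    return paths


def path_match_score(query, anchors):
    """Fraction of distinct query terms found in the path's anchor texts.

    A deliberately simple, hypothetical score; not the PathRank formula.
    """
    terms = set(query.lower().split())
    if not terms:
        return 0.0
    path_words = set(" ".join(anchors).lower().split())
    return len(terms & path_words) / len(terms)


# Toy website: the "Home" link back to "/" is organizational, not a recommendation.
site = {
    "/": [("Products", "/products"), ("About", "/about")],
    "/products": [("Laptops", "/products/laptops"), ("Home", "/")],
    "/products/laptops": [("Model X", "/products/laptops/x")],
}

paths = shortest_navigation_paths(site, "/")
score_x = path_match_score("laptops model", paths["/products/laptops/x"])
score_about = path_match_score("laptops model", paths["/about"])
```

In this toy graph the path to `/products/laptops/x` carries the anchors "Products", "Laptops", and "Model X", so a query about laptops matches the destination through its path even if the page text itself never mentions the category. A real system would presumably combine such a path signal with a content-based score.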



Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Li, J., Zhao, Y. (2009). PathRank: Web Page Retrieval with Navigation Path. In: Boughanem, M., Berrut, C., Mothe, J., Soule-Dupuy, C. (eds) Advances in Information Retrieval. ECIR 2009. Lecture Notes in Computer Science, vol 5478. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00958-7_32

  • DOI: https://doi.org/10.1007/978-3-642-00958-7_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-00957-0

  • Online ISBN: 978-3-642-00958-7

  • eBook Packages: Computer Science, Computer Science (R0)
