Using Hyperlink Features to Personalize Web Search

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3932)

Abstract

In recent years, personalized search has gained popularity as a way to improve search effectiveness. Its objective is to provide users with information tailored to their individual contexts. We propose to personalize Web search based on features extracted from hyperlinks, such as anchor terms or URL tokens. Our methodology personalizes PageRank vectors by weighting links according to the match between hyperlinks and user profiles. In particular, we describe a profile representation that uses Internet domain features extracted from URLs. Users specify interest profiles as binary vectors in which each feature corresponds to a set of one or more DNS tree nodes. Given a profile vector, a weighted PageRank is computed by assigning each URL a weight based on the match between the URL and the profile. We present promising results from an experiment in which users could select among nine URL features combining the top two levels of the DNS tree, leading to 2^9 pre-computed PageRank vectors from a Yahoo crawl. Personalized PageRank performed favorably compared to pure similarity-based ranking and traditional PageRank.
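The weighting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The feature list `FEATURES`, the `boost` value, and the rule that a page distributes its rank over outgoing links in proportion to the target URL's weight are all assumptions made for illustration, since the abstract does not specify the nine features or the exact weighting scheme.

```python
from urllib.parse import urlparse

# Hypothetical features: the abstract does not list the nine used in the paper.
FEATURES = ["edu", "gov", "com"]


def url_weight(url, profile, boost=4.0):
    """Return 1 + boost if the URL's top-level domain is selected in the
    binary profile, else 1. The boost value is an assumption."""
    host = urlparse(url).hostname or ""
    tld = host.rsplit(".", 1)[-1]
    for feature, selected in zip(FEATURES, profile):
        if selected and tld == feature:
            return 1.0 + boost
    return 1.0


def weighted_pagerank(links, profile, damping=0.85, iterations=100):
    """Power iteration in which each page splits its rank among its
    out-links in proportion to the profile weight of the target URL."""
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    weight = {p: url_weight(p, profile) for p in pages}
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if not targets:
                # Dangling page: spread its rank uniformly over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
                continue
            total = sum(weight[t] for t in targets)
            for t in targets:
                new_rank[t] += damping * rank[page] * weight[t] / total
        rank = new_rank
    return rank


# Toy graph with made-up URLs, for illustration only.
LINKS = {
    "http://a.com/": ["http://b.edu/", "http://c.gov/"],
    "http://b.edu/": ["http://a.com/"],
    "http://c.gov/": ["http://a.com/"],
}

plain = weighted_pagerank(LINKS, profile=[0, 0, 0])
personalized = weighted_pagerank(LINKS, profile=[1, 0, 0])
```

With the all-zero profile every URL has weight 1, so the computation reduces to ordinary PageRank. Selecting the `edu` feature shifts rank toward `http://b.edu/` at the expense of `http://c.gov/`. Because the profile is a binary vector over a small fixed feature set, one vector per possible profile could be computed offline in this way.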

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Aktas, M.S., Nacar, M.A., Menczer, F. (2006). Using Hyperlink Features to Personalize Web Search. In: Mobasher, B., Nasraoui, O., Liu, B., Masand, B. (eds) Advances in Web Mining and Web Usage Analysis. WebKDD 2004. Lecture Notes in Computer Science, vol 3932. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11899402_7

  • DOI: https://doi.org/10.1007/11899402_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-47127-1

  • Online ISBN: 978-3-540-47128-8

  • eBook Packages: Computer Science, Computer Science (R0)
