Abstract
The ongoing explosion of web information calls for more intelligent and personalized methods to improve search result quality for advanced queries. Query logs and click streams obtained from web browsers or search engines can contribute to better quality by exploiting the collaborative recommendations that are implicitly embedded in this information. This paper presents a new method that incorporates the notion of query nodes into the PageRank model and integrates the implicit relevance feedback given by click streams into the automated process of authority analysis. This approach generalizes the well-known random-surfer model into a random-expert model that mimics the behavior of an expert user in an extended session consisting of queries, query refinements, and result-navigation steps. The enhanced PageRank scores, coined QRank scores, can be computed offline; at query time they are combined with query-specific relevance measures with virtually no overhead. Our preliminary experiments, based on real-life query-log and click-stream traces from eight different trial users, indicate significant improvements in the precision of search results.
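The idea of adding query nodes to the link graph can be illustrated with a minimal sketch. It is not the paper's QRank formulation, which the abstract does not spell out. The sketch assumes three unweighted edge types (hyperlinks between pages, refinements between queries, and clicks from a query to a result page), a uniform random jump, and a linear combination with a relevance score at query time. The names `authority_scores`, `combined_score`, `EDGES`, and all node labels are hypothetical.

```python
def authority_scores(edges, damping=0.85, iterations=100):
    """Power iteration for a random walk with uniform random jumps.

    edges: list of (source, target) pairs over page and query nodes.
    Returns a dict mapping each node to its stationary score.
    """
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    n = len(nodes)
    scores = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Score mass on nodes without out-links is spread uniformly.
        dangling = sum(scores[v] for v in nodes if not out_links[v])
        base = (1.0 - damping) / n + damping * dangling / n
        updated = {node: base for node in nodes}
        for source, targets in out_links.items():
            if not targets:
                continue
            share = damping * scores[source] / len(targets)
            for target in targets:
                updated[target] += share
        scores = updated
    return scores


def combined_score(authority, relevance, weight=0.5):
    """Assumed query-time combination: a plain weighted sum."""
    return weight * authority + (1.0 - weight) * relevance


# Hypothetical toy graph mixing page nodes and query nodes.
EDGES = [
    ("page:a", "page:b"),                # hyperlink
    ("page:b", "page:a"),                # hyperlink
    ("page:c", "page:a"),                # hyperlink
    ("query:jaguar", "query:jaguar car"),  # query refinement
    ("query:jaguar car", "page:c"),      # click on a search result
]

if __name__ == "__main__":
    for node, score in sorted(authority_scores(EDGES).items()):
        print(f"{node}: {score:.4f}")
```

In this toy graph, page:c has no incoming hyperlinks, so it gains authority only through the click edge from a query node. The scores depend only on the graph, so they can be computed offline, and the query-time step reduces to one arithmetic combination per result.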
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Luxenburger, J., Weikum, G. (2004). Query-Log Based Authority Analysis for Web Information Search. In: Zhou, X., Su, S., Papazoglou, M.P., Orlowska, M.E., Jeffery, K. (eds) Web Information Systems – WISE 2004. WISE 2004. Lecture Notes in Computer Science, vol 3306. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30480-7_11
DOI: https://doi.org/10.1007/978-3-540-30480-7_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23894-2
Online ISBN: 978-3-540-30480-7
eBook Packages: Springer Book Archive