ABSTRACT
In this paper we consider the problem of re-ranking search results by incorporating user feedback. We present a graph-theoretic measure for discriminating irrelevant results from relevant ones using a few labeled examples provided by the user. The key intuition is that nodes closer (in graph topology) to the relevant nodes than to the irrelevant nodes are more likely to be relevant. We present a simple sampling algorithm to evaluate this measure at specific nodes of interest, and an efficient branch-and-bound algorithm to compute the top k nodes from the entire graph under this measure. On quantifiable prediction tasks, the introduced measure outperforms other diffusion-based proximity measures that take only positive relevance feedback into account. On the entity-relation graph built from the authors and papers of the entire DBLP citation corpus (1.4 million nodes and 2.2 million edges), our branch-and-bound algorithm takes about 1.5 seconds to retrieve the top 10 nodes with respect to this measure, given 10 labeled nodes.
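The abstract does not state the measure's formula, so the following is only a minimal sketch of the stated intuition, under a loudly labeled assumption: a node's score is taken to be the probability that a random walk started there reaches a relevant labeled node before an irrelevant one within a fixed number of steps. The function name `estimate_relevance`, the parameters `horizon` and `num_walks`, and the toy path graph are all hypothetical and are not taken from the paper.

```python
import random


def estimate_relevance(graph, start, relevant, irrelevant,
                       horizon=10, num_walks=2000, seed=0):
    """Monte Carlo estimate of an ASSUMED measure (not confirmed by the
    abstract): the fraction of random walks from `start` that reach a
    relevant node before any irrelevant node within `horizon` steps."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_walks):
        node = start
        for step in range(horizon + 1):
            if node in relevant:
                hits += 1  # walk absorbed at a relevant node
                break
            # Stop on an irrelevant node, a dead end, or the step limit.
            if node in irrelevant or step == horizon or not graph[node]:
                break
            node = rng.choice(graph[node])  # uniform step to a neighbor
    return hits / num_walks


# Hypothetical toy graph: an undirected path a - b - c - d - e.
graph = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
relevant = {"a"}    # node the user labeled relevant
irrelevant = {"e"}  # node the user labeled irrelevant

scores = {n: estimate_relevance(graph, n, relevant, irrelevant) for n in graph}
ranking = sorted(graph, key=lambda n: scores[n], reverse=True)
print(ranking)
```

In this toy setting, nodes nearer to `a` than to `e` should receive higher estimates, which matches the intuition in the abstract. The sketch scores one node at a time by sampling and makes no attempt at the branch-and-bound top-k search.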
Index Terms
- Fast dynamic reranking in large graphs