ABSTRACT
Pseudo relevance feedback (PRF) via query expansion assumes that the top-ranked documents from the first-pass retrieval are relevant. The most informative terms in these pseudo relevant documents are then used to update the original query representation in order to boost retrieval performance. Most current PRF approaches estimate the importance of candidate expansion terms from their document-level statistics. However, traditional query expansion models ignore context information. As a result, off-topic terms can also be selected, which may decrease retrieval performance. In this paper, we propose a context-based feedback framework based on a Bayesian network, in which multiple kinds of context information can be taken into account. To demonstrate the effectiveness of our framework, we explore two different kinds of context in our experiments. The experimental results show that our proposed algorithm performs significantly better than a strong PRF baseline.
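To make the generic PRF loop described above concrete, the following is a minimal sketch of document-level query expansion. It is not the Bayesian network framework proposed in the paper and uses no context information. The function name `expand_query`, the parameters `k` and `n_terms`, the smoothed tf-idf style weight, and the toy data are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def expand_query(query_terms, ranked_docs, collection, k=2, n_terms=2):
    """Append the n_terms highest-scoring feedback terms to the query.

    ranked_docs: first-pass results, best first, each a list of tokens.
    collection:  all documents, used only for document frequencies.
    The scoring formula is an illustrative assumption, not the paper's model.
    """
    # Pseudo relevant set: the top k documents are assumed relevant, not judged.
    feedback = ranked_docs[:k]
    n_docs = len(collection)

    # Document frequency of every term over the whole collection.
    df = Counter(t for doc in collection for t in set(doc))
    # Term frequency summed over the feedback documents only.
    tf = Counter(t for doc in feedback for t in doc)

    # Document-level weight: feedback tf times a smoothed idf.
    scores = {
        t: tf[t] * math.log((n_docs + 1) / (df[t] + 1))
        for t in tf
        if t not in query_terms
    }
    # Highest score first; ties broken alphabetically for determinism.
    best = sorted(scores, key=lambda t: (-scores[t], t))[:n_terms]
    return list(query_terms) + best


# Hypothetical toy collection of tokenized documents.
collection = [
    ["java", "coffee", "bean", "roast"],
    ["java", "coffee", "brew", "roast"],
    ["java", "island", "indonesia"],
    ["python", "snake"],
]
# Assume a first-pass retrieval for the query "java" returned these, in order.
ranked = collection[:3]
print(expand_query(["java"], ranked, collection))
```

Because the weights here depend only on term counts in the feedback documents, widening the feedback set (for example `k=3`) lets terms from the third document compete on equal footing with the others, which is the kind of off-topic selection the abstract attributes to context-free models.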
Index Terms
- A Bayesian network approach to context sensitive query expansion