Pseudo-Relevance Feedback Based on mRMR Criteria

  • Conference paper
Information Retrieval Technology (AIRS 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6458)

Abstract

Pseudo-relevance feedback has been shown to be an effective method in many information retrieval tasks. Various criteria have been proposed to rank terms extracted from the top-ranked documents of the initial retrieval results. However, most existing methods score terms individually and do not consider the relationships among terms or the effect of their combinations. In this study, we first re-examine this independence assumption and show that combinations of terms may heavily impact the final results. We then present a novel clustering-based method that selects expansion terms as a whole set. The main idea is to first cluster terms and documents simultaneously using non-negative matrix factorization, and then to use the Maximum Relevance and Minimum Redundancy (mRMR) criteria to select terms based on their clusters, term distributions, and other features. Experimental results on several TREC collections show that the proposed method significantly improves retrieval performance.
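The selection step described in the abstract can be illustrated with a minimal sketch of greedy mRMR in its common "relevance minus mean redundancy" form. This is not the authors' implementation: the function name `mrmr_select`, the relevance scores, and the pairwise redundancy values are all hypothetical, and the cluster and term-distribution features the paper derives from non-negative matrix factorization are not reproduced here.

```python
from typing import Dict, List, Tuple


def mrmr_select(relevance: Dict[str, float],
                redundancy: Dict[Tuple[str, str], float],
                k: int) -> List[str]:
    """Greedily pick up to k terms, each time adding the term that
    maximizes relevance minus mean redundancy with the selected set."""

    def red(a: str, b: str) -> float:
        # Redundancy is treated as symmetric; unlisted pairs count as 0.
        return redundancy.get((a, b), redundancy.get((b, a), 0.0))

    selected: List[str] = []
    candidates = set(relevance)
    while candidates and len(selected) < k:

        def score(t: str) -> float:
            if not selected:
                return relevance[t]  # first pick: relevance only
            penalty = sum(red(t, s) for s in selected) / len(selected)
            return relevance[t] - penalty

        # Sorting the candidates makes tie-breaking deterministic.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical toy scores (assumed for illustration, not taken from the paper).
relevance = {"car": 0.9, "automobile": 0.85, "engine": 0.6, "banana": 0.1}
redundancy = {
    ("car", "automobile"): 0.8,
    ("car", "engine"): 0.3,
    ("automobile", "engine"): 0.3,
}

print(mrmr_select(relevance, redundancy, 2))  # ['car', 'engine']
```

In this toy example "automobile" has the second-highest relevance, yet it is passed over on the second pick because it is highly redundant with "car". Ranking terms individually would have chosen it, which is the behavior the abstract argues against.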

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wu, Y., Zhang, Q., Zhou, Y., Huang, X. (2010). Pseudo-Relevance Feedback Based on mRMR Criteria. In: Cheng, PJ., Kan, MY., Lam, W., Nakov, P. (eds) Information Retrieval Technology. AIRS 2010. Lecture Notes in Computer Science, vol 6458. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17187-1_20

  • DOI: https://doi.org/10.1007/978-3-642-17187-1_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-17186-4

  • Online ISBN: 978-3-642-17187-1

  • eBook Packages: Computer Science, Computer Science (R0)
