Optimal IR: How Far Away?

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6008)

Abstract

A gap exists between what a user has in mind and what he or she can obtain from an information retrieval (IR) system through queries. We call an IR system perfect if it always provides users with what they have in mind whenever it is available in the corpus, and optimal if it presents what it finds in an optimal way. In this paper, we empirically study how far we still are from optimal and perfect IR, based on runs submitted to the TREC 2007 Genomics track. We assume that a perfect IR system would always achieve a score of 100% under the given evaluation methods. Optimal IR is simulated by runs optimized according to the evaluation methods provided by TREC. The average performance difference between the submitted runs and the perfect or optimal runs can then be obtained. Given the annual average performance improvement achieved by reranking, as reported in the literature, we estimate how far we are from optimal and perfect IR. The study indicates that we are about 7 and 16 years away from optimal and perfect IR, respectively. These are by no means exact distances, but they give a partial perspective on where we stand on the IR development path. The study also provides the lowest upper bound on the IR performance improvement achievable by reranking.
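The distance estimate described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes that improvement is linear and measured in absolute score points per year, and that the distance is simply the mean gap divided by that yearly gain. The function name `years_to_target` and all numbers in the usage note are hypothetical and are not taken from the paper.

```python
from statistics import mean


def years_to_target(run_scores, target_scores, annual_gain):
    """Estimate the years needed to close the mean gap to a target.

    Assumption (not from the paper): scores improve by a constant
    absolute amount `annual_gain` every year.
    """
    if annual_gain <= 0:
        raise ValueError("annual_gain must be positive")
    if len(run_scores) != len(target_scores) or not run_scores:
        raise ValueError("need two non-empty score lists of equal length")
    # Mean difference between the target (perfect or optimal) runs
    # and the submitted runs.
    gap = mean(t - r for r, t in zip(run_scores, target_scores))
    # A negative gap means the target is already reached.
    return max(gap, 0.0) / annual_gain
```

For instance, with hypothetical submitted scores `[0.25, 0.5]`, a perfect score of `1.0` for both runs, and a hypothetical yearly gain of `0.125`, the mean gap is `0.625` and the call `years_to_target([0.25, 0.5], [1.0, 1.0], 0.125)` gives `5.0` years. Replacing the perfect scores with the scores of optimized runs would give the distance to optimal IR under the same assumptions.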

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

An, X., Huang, X., Cercone, N. (2010). Optimal IR: How Far Away?. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2010. Lecture Notes in Computer Science, vol 6008. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12116-6_51

  • DOI: https://doi.org/10.1007/978-3-642-12116-6_51

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12115-9

  • Online ISBN: 978-3-642-12116-6

  • eBook Packages: Computer Science, Computer Science (R0)
