Upper bounds for retrieval performance and their use measuring performance and generating optimal Boolean queries: Can it get any better than this?

https://doi.org/10.1016/0306-4573(94)90064-7

Abstract

A method is presented for determining the best-case, random, and worst-case document rankings and the retrieval performance associated with each. Knowledge of the best-case performance allows users and system designers to (a) determine how close to optimal their search is and (b) select queries and matching functions that will produce the best results. A method for deriving the optimal Boolean query for a given level of recall is suggested, as is a method for measuring the quality of a Boolean query. Measures are proposed that modify conventional text retrieval measures such as precision, E, and average search length so that they take the value 1 when retrieval is optimal, 0 when retrieval is random, and −1 when retrieval is worst-case. Tests using one of these measures show that many retrievals are optimal. Consequences for retrieval research are examined.
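The scaling described above can be illustrated with a short sketch. Assuming a measure such as average search length (ASL), where smaller values are better, one plausible normalization maps the observed value linearly onto the interval between the random and best-case values when performance is at least as good as random, and onto the interval between the random and worst-case values otherwise. The function names, the toy collection, and the linear interpolation below are illustrative assumptions, not the paper's exact formulas.

```python
# Illustrative sketch (not the paper's exact formulas): normalize a
# retrieval measure so that best-case -> 1, random -> 0, worst-case -> -1.
# The measure here is average search length (ASL), where lower is better.

def average_search_length(ranking, relevant):
    """Mean 1-based position of the relevant documents in a ranking."""
    positions = [i + 1 for i, doc in enumerate(ranking) if doc in relevant]
    return sum(positions) / len(positions)

def normalized_score(actual, best, random_case, worst):
    """Linear scaling: 1 at best-case, 0 at random, -1 at worst-case."""
    if actual <= random_case:          # at least as good as random ordering
        return (random_case - actual) / (random_case - best)
    return (random_case - actual) / (worst - random_case)

# Toy collection: six documents, two of which are relevant.
docs = list("abcdef")
relevant = {"b", "e"}

best_rank = sorted(docs, key=lambda d: d not in relevant)    # relevant first
worst_rank = sorted(docs, key=lambda d: d in relevant)       # relevant last
actual_rank = ["a", "b", "e", "c", "d", "f"]                 # some retrieval

best = average_search_length(best_rank, relevant)            # 1.5
worst = average_search_length(worst_rank, relevant)          # 5.5
random_case = (len(docs) + 1) / 2                            # expected ASL = 3.5
actual = average_search_length(actual_rank, relevant)        # 2.5

print(normalized_score(actual, best, random_case, worst))    # 0.5: halfway between random and best
```

A search that places all relevant documents first scores 1 under this scaling, one that buries them all at the bottom scores −1, and one indistinguishable from a random ordering scores 0, which is the behaviour the abstract describes for the modified precision, E, and ASL measures.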

Cited by (12)

  • Is 1 noun worth 2 adjectives? Measuring relative feature utility

    Information Processing and Management (2006)