Performance Measures Used in Image Information Retrieval

Sanderson, Mark

doi:10.1007/978-3-642-15181-1_5

Part of the book series: The Information Retrieval Series ((INRE,volume 32))

Abstract

Although there was no explicit coordination on the evaluation measures employed while the ImageCLEF tracks were running, the same statistics were often used across ImageCLEF. This chapter therefore describes the range of measures used in the evaluation exercise. The original research defining each measure, its formulation, and its relative pros and cons are also detailed. Research that compares the measures and attempts to determine the best is also outlined. Finally, the use of measures in the different tracks and years of ImageCLEF is tabulated.

Author information

Authors and Affiliations

University of Sheffield, Sheffield, United Kingdom
Mark Sanderson

Corresponding author

Correspondence to Mark Sanderson.

Editor information

Editors and Affiliations

HES-SO Business Information Systems, TechnoArk 3, Sierre, 3960, Switzerland
Henning Müller
Dept. Information Studies, University of Sheffield, Portobello Street 211, Sheffield, S1 4DP, United Kingdom
Paul Clough
Computer Vision Lab/ETF-C 113.2, ETH Zürich, Zürich, 8092, Switzerland
Thomas Deselaers
Idiap Research Institute, rue Marconi 19, Martigny, 1920, Switzerland
Barbara Caputo

About this chapter

Cite this chapter

Sanderson, M. (2010). Performance Measures Used in Image Information Retrieval. In: Müller, H., Clough, P., Deselaers, T., Caputo, B. (eds) ImageCLEF. The Information Retrieval Series, vol 32. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15181-1_5

DOI: https://doi.org/10.1007/978-3-642-15181-1_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15180-4
Online ISBN: 978-3-642-15181-1
eBook Packages: Computer Science (R0)
