Abstract
Although there was no explicit coordination of the evaluation measures employed while the ImageCLEF tracks were running, the same statistics were often used across ImageCLEF. This chapter therefore describes the range of measures used in the evaluation exercise. For each measure, the original research defining it, its formulation, and its relative pros and cons are detailed. Research that compares the measures and attempts to determine the best is also outlined. Finally, the use of measures in the different tracks and years of ImageCLEF is tabulated.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Sanderson, M. (2010). Performance Measures Used in Image Information Retrieval. In: Müller, H., Clough, P., Deselaers, T., Caputo, B. (eds) ImageCLEF. The Information Retrieval Series, vol 32. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15181-1_5
DOI: https://doi.org/10.1007/978-3-642-15181-1_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15180-4
Online ISBN: 978-3-642-15181-1
eBook Packages: Computer Science, Computer Science (R0)