Abstract
One of the main components of any Text REtrieval Conference (TREC)-style information retrieval benchmark is a collection of documents, such as images, texts, sounds or videos, that is representative of a particular domain. Although many image collections exist both online and offline, finding visual resources suitable for evaluation benchmarks such as ImageCLEF is challenging. For example, these resources are often expensive to purchase and subject to specific copyright licenses, which restrict both the distribution of such data and future access to it for evaluation purposes. Nevertheless, the various ImageCLEF evaluation tasks have managed to create and/or acquire almost a dozen document collections since 2003. This chapter begins by discussing the requirements and specifications for creating a suitable document collection for evaluating multi-modal and cross-lingual image retrieval systems. It then describes each of the eleven document collections created and used for ImageCLEF tasks between 2003 and 2009. Each description covers the origins of the document collection, a summary of its content, and details regarding the distribution, benefits and limitations of the resource.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Grubinger, M., Nowak, S., Clough, P. (2010). Data Sets Created in ImageCLEF. In: Müller, H., Clough, P., Deselaers, T., Caputo, B. (eds) ImageCLEF. The Information Retrieval Series, vol 32. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15181-1_2
DOI: https://doi.org/10.1007/978-3-642-15181-1_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15180-4
Online ISBN: 978-3-642-15181-1
eBook Packages: Computer Science, Computer Science (R0)