Automatic Image Annotation at ImageCLEF

Information Retrieval Evaluation in a Changing World

Part of the book series: The Information Retrieval Series (INRE, volume 41)

Abstract

Automatic image annotation is the task of automatically assigning to images some form of semantic label, such as words, phrases or sentences describing the objects, attributes, actions, and scenes they depict. In this chapter, we present an overview of the automatic image annotation tasks that were organized in conjunction with the ImageCLEF track at CLEF from 2009 to 2016. Over these eight years, the image annotation tasks evolved from annotating Flickr photos by learning from clean data to annotating web images by learning from large-scale noisy web data. The tasks fall into three distinct phases, and this chapter discusses each phase in turn. We also compare and contrast these tasks with related benchmarking challenges, and provide some insights into the future of automatic image annotation.

Acknowledgements

The Concept Annotation, Localization and Sentence Generation tasks in ImageCLEF 2015 and 2016 were co-organized by the VisualSense (ViSen) consortium under the ERA-NET CHIST-ERA D2K 2011 Programme, jointly supported by UK EPSRC Grants EP/K019082/1 and EP/K01904X/1, French ANR Grant ANR-12-CHRI-0002-04 and Spanish MINECO Grant PCIN-2013-047.

Author information

Corresponding author

Correspondence to Josiah Wang.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Wang, J., Gilbert, A., Thomee, B., Villegas, M. (2019). Automatic Image Annotation at ImageCLEF. In: Ferro, N., Peters, C. (eds) Information Retrieval Evaluation in a Changing World. The Information Retrieval Series, vol 41. Springer, Cham. https://doi.org/10.1007/978-3-030-22948-1_11

  • DOI: https://doi.org/10.1007/978-3-030-22948-1_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-22947-4

  • Online ISBN: 978-3-030-22948-1

  • eBook Packages: Computer Science, Computer Science (R0)