About Sound and Vision: CLEF Beyond Text Retrieval Tasks

Jones, Gareth J. F.

doi:10.1007/978-3-030-22948-1_13

Chapter
First Online: 14 August 2019

Part of the book series: The Information Retrieval Series (INRE, volume 41)

Abstract

CLEF was initiated with the intention of providing a catalyst for research in Cross-Language Information Retrieval (CLIR) and Multilingual Information Retrieval (MIR). Focusing principally on European languages, it initially provided CLIR benchmark tasks to the research community within an annual cycle of task design, conduct and reporting. While the early focus was on textual data, the emergence of technologies for the collection, archiving and processing of multimedia content led to several initiatives that sought to address search for spoken and visual content. As with multilingual search for text, interest arose in working multilingually with multimedia content. To support research in these areas, CLEF introduced a number of tasks in multilingual search for multimedia content. While image retrieval has been the focus of the ImageCLEF task over many years, this chapter reviews the speech and video retrieval tasks carried out within CLEF during its first 10 years, and gives an overview of related work reported at other information retrieval benchmarks.

Acknowledgements

The success of the CLEF and MediaEval tasks described in this chapter would not have been possible without the work of the task co-chairs Marcello Federico, Douglas W. Oard, Martha Larson, Maria Eskevich, Robin Aly and Roeland Ordelman.

Author information

Authors and Affiliations

ADAPT Centre, School of Computing, Dublin City University, Dublin, Ireland
Gareth J. F. Jones

Corresponding author

Correspondence to Gareth J. F. Jones.

Editor information

Editors and Affiliations

Dipartimento di Ingegneria dell’Informazione, Università degli Studi di Padova, Padova, Italy
Nicola Ferro
Consiglio Nazionale delle Ricerche, Istituto di Scienza e Tecnologie dell’Informazione, Pisa, Italy
Carol Peters

About this chapter

Cite this chapter

Jones, G.J.F. (2019). About Sound and Vision: CLEF Beyond Text Retrieval Tasks. In: Ferro, N., Peters, C. (eds) Information Retrieval Evaluation in a Changing World. The Information Retrieval Series, vol 41. Springer, Cham. https://doi.org/10.1007/978-3-030-22948-1_13

DOI: https://doi.org/10.1007/978-3-030-22948-1_13
Published: 14 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22947-4
Online ISBN: 978-3-030-22948-1
eBook Packages: Computer Science, Computer Science (R0)
