Multimodal retrieval with relevance feedback based on genetic programming

Calumby, Rodrigo Tripodi; da Silva Torres, Ricardo; Gonçalves, Marcos André

doi:10.1007/s11042-012-1152-7

Published: 23 June 2012

Multimedia Tools and Applications, volume 69, pages 991–1019 (2014)

Rodrigo Tripodi Calumby¹,², Ricardo da Silva Torres² & Marcos André Gonçalves³

Abstract

This paper presents a framework for multimodal retrieval with relevance feedback based on genetic programming. In this supervised learning-to-rank framework, genetic programming is used to discover effective functions for combining (multimodal) similarity measures, using the information obtained over the user relevance feedback iterations. With these functions, several similarity measures, including those extracted from different modalities (e.g., text and content), are combined into a single measure that encodes the user's preferences. The framework was instantiated for multimodal image retrieval using visual and textual features and was validated on two image collections, one from the University of Washington and another from the ImageCLEF Photographic Retrieval Task. For this image retrieval instance, several multimodal relevance feedback techniques were implemented and evaluated. The proposed approach produced statistically significantly better results for multimodal retrieval than single-modality approaches, and superior effectiveness compared to the best submissions to the ImageCLEF Photographic Retrieval Task 2008.
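
The core idea of the abstract, scoring candidate combination functions against the user's feedback, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the expression-tree encoding, the protected operators, the use of average precision over the labeled images as the fitness, the toy similarity values, and all function names. The candidates are written by hand here; a genetic programming run would instead generate and evolve such trees through selection, crossover, and mutation.

```python
import math
import operator


# Protected operators are a common GP convention (assumed here).
def protected_div(a, b):
    return a / b if abs(b) > 1e-9 else 1.0


def protected_sqrt(a):
    return math.sqrt(abs(a))


BINARY = {"+": operator.add, "*": operator.mul, "/": protected_div}
UNARY = {"sqrt": protected_sqrt}


def evaluate(tree, sims):
    """Evaluate an expression tree on one image's similarity scores."""
    if isinstance(tree, str):  # leaf: name of a similarity measure
        return sims[tree]
    op, *args = tree
    values = [evaluate(arg, sims) for arg in args]
    if op in BINARY:
        return BINARY[op](*values)
    return UNARY[op](*values)


def rank(tree, images):
    """Return image ids sorted by decreasing combined similarity."""
    return sorted(images, key=lambda i: evaluate(tree, images[i]), reverse=True)


def average_precision(ranking, relevant):
    hits, total = 0, 0.0
    for position, image in enumerate(ranking, start=1):
        if image in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0


def fitness(tree, labeled, relevant):
    """Score a candidate by how well it ranks the images labeled so far."""
    return average_precision(rank(tree, labeled), relevant)


# Toy feedback data: similarity of each labeled image to the query,
# one value per modality (hypothetical numbers).
collection = {
    "a": {"text": 0.9, "visual": 0.3},
    "b": {"text": 0.8, "visual": 0.1},
    "c": {"text": 0.4, "visual": 0.9},
    "d": {"text": 0.2, "visual": 0.8},
    "e": {"text": 0.1, "visual": 0.2},
}
relevant = {"a", "c"}  # images the user marked as relevant

# Two single-modality baselines and one multimodal combination.
candidates = ["text", "visual", ("+", "text", "visual")]
best = max(candidates, key=lambda t: fitness(t, collection, relevant))

for tree in candidates:
    print(tree, round(fitness(tree, collection, relevant), 3))
```

In this toy setting each modality alone places a non-relevant image between the two relevant ones, while the summed measure ranks both relevant images first. In an interactive session, the labeled set would grow with each feedback iteration and the search for a good tree would be repeated on the enlarged set.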

Notes

http://www.cs.washington.edu/research/imagedatabase/groundtruth (as of 11/16/2011).
http://snowball.tartarus.org/algorithms/english/stemmer.html (as of 11/16/2011).
http://trec.nist.gov/trec_eval/index.html (as of 11/16/2011).
We also did not compute the C20 and F1 measures because the information about the subtopics for each image was not available for this collection.
http://www.imageclef.org/2008/results-photo (as of 11/16/2011).

Acknowledgements

We would like to thank all partners from LIS (Laboratory of Information Systems - IC/UNICAMP), RECOD (Reasoning for Complex Data - IC/UNICAMP), and LDB (Databases Lab - DCC/UFMG). This work was supported by the National Council for Scientific and Technological Development (CNPq), the Coordination for the Improvement of Higher Level Personnel (CAPES), the São Paulo Research Foundation (FAPESP), and the Minas Gerais Agency for Research and Development (FAPEMIG).

Author information

Authors and Affiliations

Department of Exact Sciences, University of Feira de Santana, Feira de Santana, Brazil
Rodrigo Tripodi Calumby
RECOD Lab, Institute of Computing, University of Campinas, Campinas, Brazil
Rodrigo Tripodi Calumby & Ricardo da Silva Torres
Department of Computer Science, Federal University of Minas Gerais, Belo Horizonte, Brazil
Marcos André Gonçalves

Corresponding author

Correspondence to Rodrigo Tripodi Calumby.

About this article

Cite this article

Calumby, R.T., da Silva Torres, R. & Gonçalves, M.A. Multimodal retrieval with relevance feedback based on genetic programming. Multimed Tools Appl 69, 991–1019 (2014). https://doi.org/10.1007/s11042-012-1152-7

Published: 23 June 2012
Issue Date: April 2014
DOI: https://doi.org/10.1007/s11042-012-1152-7
