A Learning to Rank framework applied to text-image retrieval

Multimedia Tools and Applications

Abstract

We present a framework based on a Learning to Rank setting for a text-image retrieval task. In Information Retrieval, the goal is to compute the similarity between a document and a user query. In text-image retrieval, where several similarities exist, human intervention is often needed to decide how to combine them. With the Learning to Rank approach, by contrast, the similarities are combined automatically. Learning to Rank is a paradigm in which a learnt score function produces a ranked list of images for a given user query. Such score functions are generally a combination of similarities between a document and a query. In the past, Learning to Rank algorithms were successfully applied to text retrieval, where they outperformed baselines such as BM25 or TF-IDF. This inspired us to apply our state-of-the-art algorithm, called OWPC (Usunier et al. 2009), to the text-image retrieval task. Since no benchmark is currently available for this task, we present a framework for building one. We validate the algorithm empirically on the resulting dataset by comparing it with typical text-image retrieval similarities. In both settings, visual only and combined text and visual, our algorithm performs better than a simple baseline.
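
To make the idea of a score function that combines several query-document similarities more concrete, here is a minimal sketch. It assumes a linear combination whose weights have already been learnt; the weights, image identifiers and similarity values are invented for illustration, and the code is not the OWPC algorithm itself.

```python
from typing import Dict, List, Sequence


def score(weights: Sequence[float], similarities: Sequence[float]) -> float:
    """Linear score: weighted sum of the query-image similarities."""
    if len(weights) != len(similarities):
        raise ValueError("weights and similarities must have the same length")
    return sum(w * s for w, s in zip(weights, similarities))


def rank_images(weights: Sequence[float],
                candidates: Dict[str, Sequence[float]]) -> List[str]:
    """Return the image ids sorted by decreasing score for one query."""
    return sorted(candidates,
                  key=lambda image_id: score(weights, candidates[image_id]),
                  reverse=True)


# Hypothetical data: each image has two similarities to the query,
# in the order (textual similarity, visual similarity).
weights = [0.7, 0.3]  # assumed to come from a Learning to Rank algorithm
candidates = {
    "img_a": [0.2, 0.9],
    "img_b": [0.8, 0.1],
    "img_c": [0.5, 0.5],
}
ranking = rank_images(weights, candidates)
print(ranking)
```

In a Learning to Rank setting the weights would be fitted on assessed queries rather than chosen by hand, which is the manual step the approach is meant to remove.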

Notes

  1. http://trecvid.nist.gov

  2. http://www.imageclef.org

  3. Constructing pairs is the most expensive part of Learning to Rank: in the worst case we need to create \(O([\mathbf{y}] \cdot [\bar{\mathbf{y}}])\) pairs, where \([\mathbf{y}]\) is the number of relevant elements and \([\bar{\mathbf{y}}]\) the number of irrelevant ones.

  4. http://www.imageclef.org/

  5. For clarity, we restrict ourselves to the case where the relevance judgements are binary. Note also that this count ignores the relative positions of the relevant documents.

  6. We preferred to use the 2006 collection, for which more information and more assessed queries are available than for the 2007 or 2008 collections.

  7. http://ir.dcs.gla.ac.uk/ressources/test_collections/cacm

  8. The score function is a scalar product between a weight vector and the feature vector, so the result of this product depends on the scale of the feature values.
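
Notes 3 and 8 can be illustrated with a short sketch. It assumes binary relevance labels for a single query and uses min-max scaling as one possible way to put features on a common scale; the source does not say which normalization was used, and the function names and data below are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple


def build_pairs(labels: Sequence[int]) -> List[Tuple[int, int]]:
    """Build every (relevant, irrelevant) index pair for one query.

    With binary labels this yields (#relevant * #irrelevant) pairs,
    which is the worst-case cost mentioned in note 3.
    """
    relevant = [i for i, y in enumerate(labels) if y == 1]
    irrelevant = [i for i, y in enumerate(labels) if y == 0]
    return list(product(relevant, irrelevant))


def min_max_scale(column: Sequence[float]) -> List[float]:
    """Rescale one feature to [0, 1] (an assumed normalization choice)."""
    low, high = min(column), max(column)
    if high == low:
        # A constant feature carries no ranking information.
        return [0.0 for _ in column]
    return [(value - low) / (high - low) for value in column]


# Hypothetical query with 2 relevant and 3 irrelevant images.
labels = [1, 0, 0, 1, 0]
pairs = build_pairs(labels)
print(len(pairs))  # 2 * 3 pairs

# Hypothetical raw similarity values on an arbitrary scale.
raw_feature = [10.0, 250.0, 130.0]
scaled = min_max_scale(raw_feature)
print(scaled)
```

Because the score is a scalar product, a feature with a much larger range than the others would dominate it; rescaling each feature before learning the weights is one common way to avoid this.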

References

  1. Aslam JA, Kanoulas E, Pavlu V, Savev S, Yilmaz E (2009) Document selection methodologies for efficient and effective learning-to-rank. In: SIGIR ’09: proceedings of the 32nd international ACM SIGIR conference on research and development in information retrieval. ACM, New York, NY, USA, pp 468–475

  2. Burges CJC, Ragno R, Le QV (2006) Learning to rank with nonsmooth cost functions. In: NIPS, pp 193–200

  3. Burges CJC, Shaked T, Renshaw E, Lazier A, Deeds M, Hamilton N, Hullender GN (2005) Learning to rank using gradient descent. In: ICML, pp 89–96

  4. Cao Y, Xu J, Liu T-Y, Li H, Huang Y, Hon H-W (2006) Adapting ranking SVM to document retrieval. In: SIGIR, pp 186–193

  5. Cao Z, Qin T, Liu T-Y, Tsai M-F, Li H (2007) Learning to rank: from pairwise approach to listwise approach. In: ICML, pp 129–136

  6. La Cascia M, Sethi S, Sclaroff S (1998) Combining textual and visual cues for content-based image retrieval on the world wide web. In: IEEE workshop on content-based access of image and video libraries, pp 24–28

  7. Clough P, Grubinger M, Deselaers T, Hanbury A, Müller H (2006) Overview of the ImageCLEF 2006 photographic retrieval and object annotation tasks. In: CLEF, pp 579–594

  8. Cohen WW, Schapire RE, Singer Y (1997) Learning to order things. In: NIPS

  9. Cossock D, Zhang T (2006) Subset ranking using regression. In: COLT, pp 605–619

  10. Datta R, Joshi D, Li J, Wang JZ (2008) Image retrieval: ideas, influences, and trends of the new age. ACM Comput Surv 40(2):1–60

  11. Faria FF, Veloso A, Almeida HM, Valle E, da Silva Torres R, Gonçalves MA, Meira Jr W (2010) Learning to rank for content-based image retrieval. In: MIR ’10: proceedings of the international conference on multimedia information retrieval. ACM, New York, NY, USA, pp 285–294

  12. Freund Y, Iyer R, Schapire RE, Singer Y (2003) An efficient boosting algorithm for combining preferences. JMLR 4:933–969

  13. Grubinger M, Clough PD, Müller H, Deselaers T (2006) The IAPR benchmark: a new evaluation resource for visual information systems. In: International conference on language resources and evaluation

  14. Har-Peled S, Roth D, Zimak D (2002) Constraint classification for multiclass classification and ranking. In: NIPS, pp 785–792

  15. Hu Y, Li MJ, Yu N (2008) Multiple-instance ranking: learning to rank images for image retrieval. In: CVPR08, pp 1–8

  16. Järvelin K, Kekäläinen J (2000) IR evaluation methods for retrieving highly relevant documents. In: SIGIR. ACM, New York, NY, USA, pp 41–48

  17. Joachims T (2002) Optimizing search engines using clickthrough data. In: KDD, pp 133–142

  18. Li M, Zheng Y-T, Lin S-X, Zhang Y-D, Chua T-S (2008) Multimedia evidence fusion for video concept detection via OWA operator. In: MMM ’09: proceedings of the 15th international multimedia modeling conference on advances in multimedia modeling. Springer, Berlin, Heidelberg, pp 208–216

  19. Porter MF (1980) An algorithm for suffix stripping. Program 14(3):130–137

  20. Robertson SE, Walker S, Hancock-Beaulieu M, Gull A, Lau M (1992) Okapi at TREC. In: TREC, pp 21–30

  21. Rui Y, Huang T (2000) Optimizing learning in image retrieval. In: CVPR, pp 236–243

  22. Taylor M, Guiver J, Robertson S, Minka T (2008) Softrank: optimizing non-smooth rank metrics. In: WSDM ’08. ACM, pp 77–86

  23. Tollari S, Detyniecki M, Fakeri-Tabrizi A, Marsala C, Amini M-R, Gallinari P (2008) Using visual concepts and fast visual diversity to improve image retrieval. In: Peters C, Deselaers T, Ferro N, Gonzalo J, Jones GJF, Kurimo M, Mandl T, Peñas A, Petras V (eds) CLEF. Lecture notes in computer science, vol 5706. Springer, pp 577–584

  24. Tollari S, Glotin H (2007) Web image retrieval on imageval: evidences on visualness and textualness concept dependency in fusion model. In: ACM international conference on image and video retrieval (ACM CIVR)

  25. Tollari S, Glotin H (2008) Learning optimal visual features from web sampling in online image retrieval. In: IEEE international conference on acoustics, speech and signal processing (ICASSP)

  26. Tong S, Chang E (2001) Support vector machine active learning for image retrieval. In: MULTIMEDIA ’01: proceedings of the ninth ACM international conference on multimedia. ACM, New York, NY, USA, pp 107–118

  27. Tsochantaridis I, Joachims T, Hofmann T, Altun Y (2005) Large margin methods for structured and interdependent output variables. J Mach Learn Res 6:1453–1484

  28. Usunier N, Buffoni D, Gallinari P (2009) Ranking with ordered weighted pairwise classification. In: Danyluk AP, Bottou L, Littman ML (eds) ICML. ACM international conference proceeding series, vol 382. ACM, p 133

  29. Xu J, Li H (2007) Adarank: a boosting algorithm for information retrieval. In: SIGIR, pp 391–398

  30. Xu J, Liu T-Y, Lu M, Li H, Ma W-Y (2008) Directly optimizing evaluation measures in learning to rank. In: SIGIR, pp 107–114

  31. Yager RR (1988) On ordered weighted averaging aggregation operators in multi-criteria decision making. IEEE Trans Syst Man Cybern 18:183–190

  32. Baeza-Yates R, Ribeiro-Neto B (1999) Modern information retrieval. Addison Wesley

  33. Yue Y, Finley T, Radlinski F, Joachims T (2007) A support vector method for optimizing average precision. In: SIGIR, pp 271–278

  34. Zhai C, Lafferty J (2004) A study of smoothing methods for language models applied to information retrieval. ACM Trans Inf Syst 22(2):179–214

  35. Zhou XS, Huang TS (2002) Unifying keywords and visual contents in image retrieval. IEEE Multimed 9(2):23–33

Acknowledgement

This work was partially supported by the French National Agency of Research (ANR-06-MDCA-002 AVEIR project).

Author information

Corresponding author

Correspondence to David Buffoni.

About this article

Cite this article

Buffoni, D., Tollari, S. & Gallinari, P. A Learning to Rank framework applied to text-image retrieval. Multimed Tools Appl 60, 161–180 (2012). https://doi.org/10.1007/s11042-011-0806-1

  • DOI: https://doi.org/10.1007/s11042-011-0806-1
