
Media objects for user-centered similarity matching

Multimedia Tools and Applications

Abstract

The growing number of digital image and video acquisition devices, combined with the growth of the World Wide Web, calls for user-relevant similarity matching methods that give users meaningful access to the documents they search for within large amounts of data. The aim of our work is to define media objects for document description, suited to images and videos, that integrate a user-centered definition of importance for similarity matching. Importance is defined according to criteria and hypotheses that have been experimentally validated. This leads to a weighting scheme for media objects (based on object size, position, and scene homogeneity), which has also been validated with users in a second experiment. The resulting model allows for meaningful similarity matching between document pairs and between users’ queries and documents.
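To make the idea of an importance-based weighting scheme concrete, the following minimal Python sketch weights annotated objects by size and position and compares two documents through their weighted objects. It is an illustration under stated assumptions, not the model of the article: the names `MediaObject`, `object_weights`, `similarity`, and the mixing parameter `alpha` are hypothetical, the linear combination is an arbitrary choice, and scene homogeneity is left out because this excerpt gives no definition of it.

```python
import math
from dataclasses import dataclass


@dataclass
class MediaObject:
    """Hypothetical annotated object; coordinates are normalized to [0, 1]."""
    label: str
    area: float  # fraction of the image covered by the object
    cx: float    # x coordinate of the object centre
    cy: float    # y coordinate of the object centre


def position_score(obj: MediaObject) -> float:
    """1.0 at the image centre, 0.0 at a corner (assumed centrality measure)."""
    distance = math.hypot(obj.cx - 0.5, obj.cy - 0.5)
    return 1.0 - distance / math.sqrt(0.5)


def object_weights(objects: list[MediaObject], alpha: float = 0.5) -> dict[str, float]:
    """Mix size and position scores, then normalize the weights to sum to 1."""
    raw: dict[str, float] = {}
    for obj in objects:
        score = alpha * obj.area + (1.0 - alpha) * position_score(obj)
        raw[obj.label] = raw.get(obj.label, 0.0) + score
    total = sum(raw.values())
    if total == 0.0:
        # Degenerate case: fall back to uniform weights.
        return {label: 1.0 / len(raw) for label in raw}
    return {label: score / total for label, score in raw.items()}


def similarity(w1: dict[str, float], w2: dict[str, float]) -> float:
    """Weighted overlap: sum of the smaller weight for each shared label."""
    return sum(min(w1.get(label, 0.0), w2.get(label, 0.0)) for label in set(w1) | set(w2))


photo = [MediaObject("person", 0.4, 0.5, 0.5), MediaObject("tree", 0.1, 0.1, 0.1)]
query = [MediaObject("person", 0.3, 0.5, 0.5)]
weights = object_weights(photo)
print(weights)
print(similarity(weights, object_weights(query)))
```

With these assumptions, a large, centered object dominates the description of a document, so a query that mentions only that object still matches the document strongly.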



Notes

  1. Defined as “excessive demand made on the cognitive processes, in particular memory” in [25] page 717.

  2. This choice is motivated by the fact that one author was present in the same room as the participants during the experiments, so his face is more easily recognizable to the participants than an arbitrarily chosen face. In the experiment section, this face is labeled “Jean”.

  3. We do not consider the empty set that corresponds to combinations involving no criterion.

  4. Since the fragmentation criterion proved ineffective in the experiment (see Section 4.4), no formal model for F is given here.

  5. In our experiments, image collections have been annotated manually in order to ensure a high annotation quality, so that the experiments can evaluate the quality of the model without being biased by errors of an automatic annotation process.

  6. These four assessors are different from those who participated in the first experiment.

  7. The maximum divergence value is reached when the two rankings are in reverse order.
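
The property stated in note 7 can be illustrated with a short Python sketch. The divergence measure used in the article is not given in this excerpt, so the sketch assumes a Kendall tau distance (the number of item pairs ordered differently by the two rankings), which reaches its maximum exactly when one ranking is the reverse of the other. The function names and the document identifiers are hypothetical.

```python
from itertools import combinations


def kendall_tau_distance(ranking_a: list[str], ranking_b: list[str]) -> int:
    """Count item pairs that the two rankings place in opposite order."""
    position_b = {item: index for index, item in enumerate(ranking_b)}
    discordant = 0
    for first, second in combinations(ranking_a, 2):
        # `first` precedes `second` in ranking_a; the pair is discordant
        # when ranking_b orders the two items the other way round.
        if position_b[first] > position_b[second]:
            discordant += 1
    return discordant


def normalized_divergence(ranking_a: list[str], ranking_b: list[str]) -> float:
    """Scale the distance to [0, 1] by the number of item pairs."""
    n = len(ranking_a)
    max_pairs = n * (n - 1) // 2
    if max_pairs == 0:
        return 0.0
    return kendall_tau_distance(ranking_a, ranking_b) / max_pairs


system_ranking = ["d1", "d2", "d3", "d4"]
print(normalized_divergence(system_ranking, system_ranking))        # identical rankings
print(normalized_divergence(system_ranking, system_ranking[::-1]))  # reversed rankings
```

Both rankings are assumed to contain the same items without ties; handling partial or tied rankings would need a different measure.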

References

  1. Ayache S, Quénot G, Gensel J (2007) Classifier fusion for svm-based multimedia semantic indexing. In: European conference of information retrieval (ECIR’2007), pp 494–504

  2. Baeza-Yates R, Ribeiro-Neto B (1999) Modern information retrieval. Addison-Wesley

  3. Baillargeon G (1998) Probabilités, statistique et techniques de régression. SMG, Québec

  4. Bastan M, Duygulu P (2006) Recognizing objects and scenes in news videos. In: Proceedings of CIVR’2006, Phoenix AZ

  5. Carson C, Thomas M, Belongie S, Hellerstein JM, Malik J (1999) Blobworld: a system for region-based image indexing and retrieval. In: International conference on visual information and information systems, Springer

  6. Hoerster E, Lienhart R, Slaney M (2007) Image retrieval on large-scale image databases. In: Proceedings of CIVR’2007, Amsterdam, The Netherlands, pp 17–24

  7. Hollink L, Schreiber ATh, Wielinga BJ, Worring M (2004) Classification of user image descriptions. Int J Human-Comput Stud 61:601–626


  8. Itti L, Koch C, Niebur E (1998) A model of saliency-based visual attention for rapid scene analysis. IEEE Trans Pattern Anal Mach Intell 20(11):1254–1259


  9. Lim JH, Tian Q, Mulhem P (2003) Home photo content modeling for personalized event-based retrieval. IEEE Multimed Spec Issue Multimed Content Model Personalization 10(4):28–37


  10. Lee JH (1997) Analyses of multiple evidence combination. In: SIGIR’97, Philadelphia, USA, pp 267–276

  11. Lim J-H (2000) Photograph retrieval and classification by visual keywords and thesaurus. New Generat Comput 18:147–156


  12. Lim J-H (2001) Building visual vocabulary for image indexation and query formulation. Pattern Anal Appl (Special Issue on Image Indexation) 4(2/3):125–139


  13. Kleine-Horst L (2001) Empiristic theory of visual gestalt perception. Hierarchy and interactions of visual functions. Enane, Köln

  14. Lowe D (1999) Object recognition from local scale-invariant features. In: Proceedings of ICCV, pp 1150–1157

  15. Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vision 60(2):91–110


  16. Lu Y, Guo H (1999) Background removal in image indexing and retrieval. In: International conference on image analysis and processing (ICIAP 1999). Venice, Italy, pp 933–938

  17. Martin P, Bateson P (1986) Measuring behaviour: an introductory guide. Cambridge University Press

  18. Martinet J, Chiaramella Y, Mulhem P (2005) A model for weighting image objects in home photographs. In: ACM-CIKM’2005. Bremen, Germany, pp 760–767

  19. Martinet J, Chiaramella Y, Mulhem P, Ounis I (2003) Photograph indexing and retrieval using star-graphs. In: Proceedings of CBMI’03—third international workshop on content-based multimedia indexing. Rennes, pp 335–341

  20. Martinet J, Satoh S (2007) Using visual-textual mutual information for inter-modal document indexing. In: ECIR’07. Rome, Italy

  21. Mitra M, Singhal A, Buckley C (1998) Improving automatic query expansion. In: Research and development in information retrieval, pp 206–214

  22. Mulhem P, Lim JH, Leow WK, Kankanhalli M (2003) Advances in digital home photo albums. In: Deb S (ed) Multimedia systems and content-based image retrieval. Idea Group Publishing

  23. Osberger W, Maeder AJ (1998) Automatic identification of perceptually important regions in an image using a model of the human visual system. In: ICPR, Brisbane, Australia

  24. Ounis I, Pasca M (1998) Finding the best parameters for image ranking: a user-oriented approach. In: Proceedings of The IEEE knowledge and data engineering exchange conference (KDEX’98). Taipei, Taiwan, pp 50–59

  25. Preece J (1994) Human-computer interaction. Addison-Wesley

  26. Qiu Y, Frei H-P (1993) Concept-based query expansion. In: SIGIR’93. Pittsburgh, USA, pp 160–169

  27. Quelhas P, Monay F, Odobez J-M, Gatica-Perez D, Tuytelaars T (2007) A thousand words in a scene. IEEE Trans Pattern Anal Mach Intell 29(9):1575–1589


  28. Quelhas P, Odobez J-M (2006) Natural scene image modeling using color and texture visterms. In: CIVR’2006. Phoenix AZ

  29. Rodden K (1999) How do people organise their photographs? In: Proceedings of 25th BCS-IRSG Colloquium on IR

  30. Rodden K, Wood KR (2003) How do people manage their digital photographs? In: ACM conference on human factors in computing systems—CHI’03. Florida, USA, pp 409–416

  31. Rojer AS, Schwartz EL (1990) Design considerations for a space-variant visual sensor with complex-logarithmic geometry. In: 10th international conference on pattern recognition, vol 2, pp 278–285


  32. Salton G (1971) The SMART retrieval system. Prentice Hall

  33. Salton G, Buckley C (1988) Term-weighting approaches in automatic text retrieval. In: Information processing and management, pp 513–523

  34. Salton G, McGill M (1983) Introduction to modern information retrieval. McGraw-Hill

  35. Salton G, Wong A, Yang C (1975) A vector space model for automatic indexing. Commun ACM 18(11):613–620


  36. Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27:379–423, 623–656


  37. Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Trans Pattern Anal Mach Intell 22(8):888–905


  38. Sivic J, Zisserman A (2003) Video google: a text retrieval approach to object matching in videos. In: Ninth IEEE international conference on computer vision, vol 2, pp 1470–1477

  39. Smeulders AW, Worring M, Santini S, Gupta A, Jain R (2000) Content-based image retrieval at the end of the early years. IEEE Trans Pattern Anal Mach Intell 22(12):1349–1380


  40. Snoek CGM, Worring M (2005) Multimodal video indexing: a review of the state-of-the-art. Multimed Tools Appl 25(1):5–35


  41. Stentiford F (2003) An attention based similarity measure with application to content based information retrieval. In: Proceedings of SPIE storage and retrieval for media databases, vol 5021. Santa Clara, CA, USA

  42. ter Haar Romeny BM (2003) Front-end vision and multi-scale image analysis. Kluwer

  43. van Rijsbergen CJ (1979) Information retrieval, 2nd edn. Butterworths, London


  44. Vapnik V (1995) The nature of statistical learning theory. Springer, New York


  45. Vogel J, Schiele B (2007) Semantic modeling of natural scenes for content-based image retrieval. Int J Comput Vision 72(2):133–157


  46. Wang JZ, Du Y (2001) RF × IPF: a weighting scheme for multimedia information retrieval. In: ICIAP, pp 380–385

  47. Wang JZ, Li J, Wiederhold G (2001) SIMPLIcity: semantics-sensitive integrated matching for picture libraries. IEEE Trans Pattern Anal Mach Intell 23(9):947–963


  48. Yahiaoui I, Merialdo B, Huet B (2003) Comparison of multi-episode video summarisation algorithms. EURASIP J Appl Signal Process 1:48–55


  49. Zheng Q-F, Wang W-Q, Gao W (2006) Effective and efficient object-based image retrieval using visual phrases. In: ACM multimedia’2006. Santa Barbara, California


Author information

Corresponding author

Correspondence to Jean Martinet.

Additional information

Jean Martinet is currently supported by the Japan Society for the Promotion of Science (JSPS). The authors wish to thank the reviewers for their helpful comments, which improved the quality of this work.


About this article

Cite this article

Martinet, J., Satoh, S., Chiaramella, Y. et al. Media objects for user-centered similarity matching. Multimed Tools Appl 39, 263–291 (2008). https://doi.org/10.1007/s11042-008-0200-9


  • DOI: https://doi.org/10.1007/s11042-008-0200-9
