Abstract
The proliferation of digital image and video acquisition devices, combined with the growth of the World Wide Web, calls for user-relevant similarity matching methods that give users meaningful access to the documents they seek within large amounts of data. The aim of our work is to define media objects for describing image and video documents, integrating a user-centered definition of importance for similarity matching. Importance is defined according to criteria and hypotheses that have been experimentally validated. This leads to a weighting scheme for media objects (based on object size, position, and scene homogeneity), which has also been validated with users in a second experiment. The resulting model allows for meaningful similarity matching between document pairs and between users’ queries and documents.
Notes
Defined as “excessive demand made on the cognitive processes, in particular memory” in [25], p. 717.
This choice is motivated by the fact that one of the authors was present in the same room as the participants during the experiments, so his face was more easily recognizable to them than an arbitrarily chosen face. In the experiment section, this face is labeled “Jean”.
We do not consider the empty set that corresponds to combinations involving no criterion.
Since the fragmentation criterion proved ineffective in the experiment (see Section 4.4), no formal model for F is given here.
In our experiments, the image collections were annotated manually to ensure high annotation quality, so that the experiments evaluate the quality of the model without being biased by the errors of an automatic annotation process.
These four assessors are different from those who participated in the first experiment.
The maximum divergence value is reached when the two rankings are in a reverse order.
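As an illustration of the last note, the following minimal sketch shows a ranking divergence that peaks when two rankings are in reverse order. It assumes the Kendall tau distance (the number of item pairs ordered differently) as a stand-in; the divergence measure actually used in the article is not specified in this section, and the function names and sample image identifiers are hypothetical.

```python
from itertools import combinations


def kendall_tau_distance(ranking_a, ranking_b):
    """Count the item pairs that the two rankings order differently.

    Assumption: both rankings contain the same items, each exactly once.
    """
    position_b = {item: rank for rank, item in enumerate(ranking_b)}
    discordant = 0
    for first, second in combinations(ranking_a, 2):
        # 'first' precedes 'second' in ranking_a; the pair is discordant
        # when ranking_b places them in the opposite order.
        if position_b[first] > position_b[second]:
            discordant += 1
    return discordant


def normalized_divergence(ranking_a, ranking_b):
    """Scale the distance to [0, 1] by the total number of pairs."""
    n = len(ranking_a)
    max_pairs = n * (n - 1) // 2
    if max_pairs == 0:
        return 0.0
    return kendall_tau_distance(ranking_a, ranking_b) / max_pairs


# Hypothetical rankings of four images.
system_ranking = ["img1", "img2", "img3", "img4"]
print(normalized_divergence(system_ranking, system_ranking))        # 0.0
print(normalized_divergence(system_ranking, system_ranking[::-1]))  # 1.0
```

With this stand-in measure, every pair is discordant when one ranking is the reverse of the other, so the normalized value reaches its maximum of 1.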
References
Ayache S, Quénot G, Gensel J (2007) Classifier fusion for SVM-based multimedia semantic indexing. In: European conference on information retrieval (ECIR’2007), pp 494–504
Baeza-Yates R, Ribeiro-Neto B (1999) Modern information retrieval. Addison-Wesley
Baillargeon G (1998) Probabilités, statistique et techniques de régression. SMG, Québec
Bastan M, Duygulu P (2006) Recognizing objects and scenes in news videos. In: Proceedings of CIVR’2006, Phoenix AZ
Carson C, Thomas M, Belongie S, Hellerstein JM, Malik J (1999) Blobworld: a system for region-based image indexing and retrieval. In: International conference on visual information and information systems, Springer
Hoerster E, Lienhart R, Slaney M (2007) Image retrieval on large-scale image databases. In: Proceedings of CIVR’2007, Amsterdam, The Netherlands, pp 17–24
Hollink L, Schreiber ATh, Wielinga BJ, Worring M (2004) Classification of user image descriptions. Intl J Human-Comput Stud 61:601–626
Itti L, Koch C, Niebur E (1998) A model of saliency-based visual attention for rapid scene analysis. IEEE Trans Pattern Anal Mach Intell 20(11):1254–1259
Lim JH, Tian Q, Mulhem P (2003) Home photo content modeling for personalized event-based retrieval. IEEE Multimed Spec Issue Multimed Content Model Personalization 10(4):28–37
Lee JH (1997) Analyses of multiple evidence combination. In: SIGIR’97, Philadelphia, USA, pp 267–276
Lim J-H (2000) Photograph retrieval and classification by visual keywords and thesaurus. New Generat Comput 18:147–156
Lim J-H (2001) Building visual vocabulary for image indexation and query formulation. Pattern Anal Appl (Special Issue on Image Indexation) 4(2/3):125–139
Lothar K-H (2001) Empiristic theory of visual gestalt perception. Hierarchy and interactions of visual functions. Koeln, Enane
Lowe D (1999) Object recognition from local scale-invariant features. In: Proceedings of ICCV, pp 1150–1157
Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vision 60(2):91–110
Lu Y, Guo H (1999) Background removal in image indexing and retrieval. In: International conference on image analysis and processing (ICIAP 1999). Venice, Italy, pp 933–938
Martin P, Bateson P (1986) Measuring behaviour: an introductory guide. Cambridge University Press
Martinet J, Chiaramella Y, Mulhem P (2005) A model for weighting image objects in home photographs. In: ACM-CIKM’2005. Bremen, Germany, pp 760–767
Martinet J, Chiaramella Y, Mulhem P, Ounis I (2003) Photograph indexing and retrieval using star-graphs. In: Proceedings of CBMI’03—third international workshop on content-based multimedia indexing. Rennes, pp 335–341
Martinet J, Satoh S (2007) Using visual-textual mutual information for inter-modal document indexing. In: ECIR’07. Rome, Italy
Mitra M, Singhal A, Buckley C (1998) Improving automatic query expansion. In: Research and development in information retrieval, pp 206–214
Mulhem P, Lim JH, Leow WK, Kankanhalli M (2003) Advances in digital home photo albums. In: Deb S (ed) Multimedia systems and content-based image retrieval. Idea Group Publishing
Osberger W, Maeder AJ (1998) Automatic identification of perceptually important regions in an image using a model of the human visual system. In: ICPR, Brisbane, Australia
Ounis I, Pasca M (1998) Finding the best parameters for image ranking: a user-oriented approach. In: Proceedings of The IEEE knowledge and data engineering exchange conference (KDEX’98). Taipei, Taiwan, pp 50–59
Preece J (1994) Human-computer interaction. Addison-Wesley
Qiu Y, Frei H-P (1993) Concept-based query expansion. In: SIGIR’93. Pittsburgh, USA, pp 160–169
Quelhas P, Monay F, Odobez J-M, Gatica-Perez D, Tuytelaars T (2007) A thousand words in a scene. IEEE Trans Pattern Anal Mach Intell 29(9):1575–1589
Quelhas P, Odobez J-M (2006) Natural scene image modeling using color and texture visterms. In: CIVR’2006. Phoenix AZ
Rodden K (1999) How do people organise their photographs? In: Proceedings of 25th BCS-IRSG Colloquium on IR
Rodden K, Wood KR (2003) How do people manage their digital photographs? In: ACM conference on human factors in computing systems—CHI’03. Florida, USA, pp 409–416
Rojer AS, Schwartz EL (1990) Design considerations for a space-variant visual sensor with complex-logarithmic geometry. In: 10th international conference on pattern recognition, vol 2, pp 278–285
Salton G (1971) The SMART retrieval system. Prentice Hall
Salton G, Buckley C (1988) Term-weighting approaches in automatic text retrieval. Inf Process Manag 24(5):513–523
Salton G, McGill M (1983) Introduction to modern information retrieval. McGraw-Hill
Salton G, Wong A, Yang C (1975) A vector space model for automatic indexing. Commun ACM 18(11):613–620
Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27:379–423, 623–656
Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Trans Pattern Anal Mach Intell 22(8):888–905
Sivic J, Zisserman A (2003) Video Google: a text retrieval approach to object matching in videos. In: Ninth IEEE international conference on computer vision, vol 2, pp 1470–1477
Smeulders AW, Worring M, Santini S, Gupta A, Jain R (2000) Content-based image retrieval at the end of the early years. IEEE Trans Pattern Anal Mach Intell 22(12):1349–1380
Snoek CGM, Worring M (2005) Multimodal video indexing: a review of the state-of-the-art. Multimed Tools Appl 25(1):5–35
Stentiford F (2003) An attention based similarity measure with application to content based information retrieval. In: Proceedings of SPIE storage and retrieval for media databases, vol 5021. Santa Clara, CA, USA
ter Haar Romeny BM (2003) Front-end vision and multi-scale image analysis. Kluwer
van Rijsbergen CJ (1979) Information retrieval, 2nd edn. Butterworths, London
Vapnik V (1995) The nature of statistical learning theory. Springer, New York
Vogel J, Schiele B (2007) Semantic modeling of natural scenes for content-based image retrieval. Int J Comput Vision 72(2):133–157
Wang JZ, Du Y (2001) RF × IPF: a weighting scheme for multimedia information retrieval. In: ICIAP, pp 380–385
Wang JZ, Li J, Wiederhold G (2001) SIMPLIcity: semantics-sensitive integrated matching for picture libraries. IEEE Trans Pattern Anal Mach Intell 23(9):947–963
Yahiaoui I, Merialdo B, Huet B (2003) Comparison of multi-episode video summarisation algorithms. EURASIP J Appl Signal Process 1:48–55
Zheng Q-F, Wang W-Q, Gao W (2006) Effective and efficient object-based image retrieval using visual phrases. In: ACM multimedia’2006. Santa Barbara, California
Additional information
Jean Martinet is currently supported by the Japan Society for the Promotion of Science (JSPS). The authors thank the reviewers for their helpful comments, which improved the quality of this work.
About this article
Cite this article
Martinet, J., Satoh, S., Chiaramella, Y. et al. Media objects for user-centered similarity matching. Multimed Tools Appl 39, 263–291 (2008). https://doi.org/10.1007/s11042-008-0200-9
DOI: https://doi.org/10.1007/s11042-008-0200-9