Image similarity: from syntax to weak semantics

Perkiö, Jukka; Tuominen, Antti; Vähäkangas, Taneli; Myllymäki, Petri

doi:10.1007/s11042-010-0562-7

Published: 14 July 2010

Multimedia Tools and Applications, volume 57, pages 5–27 (2012)

Abstract

Measuring image similarity is an important task for various multimedia applications. Similarity can be defined at two levels: the syntactic (lower, context-free) level and the semantic (higher, contextual) level. At the syntactic level, defining and measuring similarity is relatively straightforward, but at the semantic level the task becomes very difficult. We examine the use of simple, readily available syntactic image features combined with other multimodal features to derive a similarity measure that captures the weak semantics of an image. Weak semantics can be seen as an intermediate step between low-level image understanding and full semantic image understanding. We investigate the use of single modalities alone and examine how combining modalities affects the similarity measures. We also test the measure on a multimedia retrieval task using TV series data, although our main motivation is to understand how the different modalities relate to each other.

Notes

In this paper we use the concepts of image similarity and distance between images quite freely. The two are inversely related, however: the higher the similarity, the smaller the distance, and vice versa.
http://images.google.com
http://images.search.yahoo.com/images
ISO/IEC 13818-2:2000—information technology—generic coding of moving pictures and associated audio information: video.
Called slices in the standard
The standard specifies different kinds of blocks, but since we consider only I-pictures, all blocks are so-called intra blocks.
DC-coefficients are the zero frequency coefficients for the DCT and AC-coefficients the rest of the coefficients.
We limit ourselves to the use of visual words as the basic syntactic feature for images.
Feature counts are always positive or zero.
The query features can be either visual or textual. In the experiments of this paper, however, we use only textual query features.
For our purposes we need multimedia data that contains both video and text. The TREC video track data is not usable for us, since, for example, the amount of text in the TRECVID ASR corpora is too low for meaningful textual modeling. Even with the additional text that we currently use, the amount of text is on the low side.
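
The notes above mention non-negative feature counts, the inverse relation between similarity and distance, and features from more than one modality. The following is a minimal sketch of how these ideas fit together, not the method of the paper: it assumes cosine similarity over sparse count vectors and a simple key-prefixing scheme for merging modalities, neither of which is stated in the text above. All function names and the example feature names are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors.

    With non-negative counts the result lies in [0, 1].
    """
    dot = sum(count * b[key] for key, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: an empty vector is similar to nothing
    return dot / (norm_a * norm_b)


def distance(a: Counter, b: Counter) -> float:
    """Distance as the inverse of similarity: high similarity, small distance."""
    return 1.0 - cosine_similarity(a, b)


def combine(visual: Counter, text: Counter) -> Counter:
    """Merge two modalities into one count vector (assumed scheme).

    Keys are tagged with their modality so that a visual word and a
    textual word with the same name never collide.
    """
    merged = Counter()
    for key, count in visual.items():
        merged[("visual", key)] += count
    for key, count in text.items():
        merged[("text", key)] += count
    return merged


# Hypothetical example data: visual-word counts and text-word counts per frame.
frame_a = combine(Counter({"vw12": 3, "vw7": 1}), Counter({"kitchen": 2}))
frame_b = combine(Counter({"vw12": 2, "vw40": 1}), Counter({"kitchen": 1}))
frame_c = combine(Counter({"vw99": 4}), Counter({"street": 1}))

print(distance(frame_a, frame_b))  # shared features, so distance below 1
print(distance(frame_a, frame_c))  # no shared features, so distance is 1.0
```

Because the counts are never negative, the cosine similarity stays between 0 and 1, so `1 - similarity` is a convenient bounded distance. Other choices (for example, weighting one modality more heavily) would fit the same structure.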

Acknowledgements

This work was supported in part by the IST Programme of the European Community under the PASCAL Network of Excellence and under the CLASS project, and by the Academy of Finland under projects VISCI and HPE, and by the Finnish Funding Agency for Technology and Innovation under the project MIFSAS.

Author information

Authors and Affiliations

Helsinki Institute for Information Technology, Helsinki, Finland
Jukka Perkiö, Antti Tuominen, Taneli Vähäkangas & Petri Myllymäki

Corresponding author

Correspondence to Jukka Perkiö.

About this article

Cite this article

Perkiö, J., Tuominen, A., Vähäkangas, T. et al. Image similarity: from syntax to weak semantics. Multimed Tools Appl 57, 5–27 (2012). https://doi.org/10.1007/s11042-010-0562-7

Issue Date: March 2012
DOI: https://doi.org/10.1007/s11042-010-0562-7
