Not Just a Matter of Semantics: The Relationship Between Visual and Semantic Similarity

Brust, Clemens-Alexander; Denzler, Joachim

doi:10.1007/978-3-030-33676-9_29

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11824)

Included in the following conference series:

German Conference on Pattern Recognition

Abstract

Knowledge transfer, zero-shot learning and semantic image retrieval are methods that aim at improving accuracy by utilizing semantic information, e.g., from WordNet. It is assumed that this information can augment or replace missing visual data in the form of labeled training images because semantic similarity correlates with visual similarity.

This assumption may seem trivial, but it is crucial for the application of such semantic methods. Any violation can cause mispredictions. Thus, it is important to examine the visual-semantic relationship for a given target problem. In this paper, we use five semantic and five visual similarity measures to analyze the relationship thoroughly without relying too heavily on any single definition.

We postulate and verify three highly consequential hypotheses on the relationship. Our results show that it indeed exists and that WordNet semantic similarity carries more information about visual similarity than just the knowledge of “different classes look different”. They suggest that classification is not the ideal application for semantic methods and that wrong semantic information is much worse than none.
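
As a rough illustration of what examining the visual-semantic relationship for a target problem can look like, the sketch below compares one semantic and one visual similarity measure over all class pairs and reports their rank correlation. Everything in it is an assumption made for illustration: the toy taxonomy, the feature vectors, the choice of a path-based semantic measure and cosine similarity as the visual measure, and the use of Spearman rank correlation to quantify agreement. It does not reproduce the measures, data, or results of the paper.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy taxonomy (child -> parent); not taken from WordNet.
PARENT = {
    "cat": "mammal", "dog": "mammal", "mammal": "animal",
    "sparrow": "bird", "bird": "animal", "animal": "object",
    "car": "vehicle", "vehicle": "object",
}

# Hypothetical per-class "visual" features (e.g. averaged image descriptors).
FEATURES = {
    "cat": (0.9, 0.8, 0.1),
    "dog": (0.8, 0.9, 0.2),
    "sparrow": (0.6, 0.3, 0.4),
    "car": (0.1, 0.2, 0.9),
}


def ancestors(node):
    """Return the path from node up to the root, node first."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def path_similarity(a, b):
    """1 / (1 + number of edges on the shortest path through the taxonomy)."""
    path_a, path_b = ancestors(a), ancestors(b)
    index_in_b = {n: i for i, n in enumerate(path_b)}
    for i, n in enumerate(path_a):
        if n in index_in_b:  # first shared node is the lowest common ancestor
            return 1.0 / (1 + i + index_in_b[n])
    raise ValueError("no common ancestor")


def cosine_similarity(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v)))


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def visual_semantic_correlation(classes):
    """Correlate semantic and visual similarity over all class pairs."""
    pairs = list(combinations(classes, 2))
    semantic = [path_similarity(a, b) for a, b in pairs]
    visual = [cosine_similarity(FEATURES[a], FEATURES[b]) for a, b in pairs]
    return spearman(semantic, visual)


rho = visual_semantic_correlation(["cat", "dog", "sparrow", "car"])
print(f"Spearman correlation over class pairs: {rho:.3f}")
```

A correlation near 1 would indicate that semantically close class pairs also tend to be visually close under the chosen measures. Repeating the comparison with several measures on each side, as the abstract describes, reduces the dependence on any single definition of similarity.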

Acknowledgements

This work was supported by the DAWI research infrastructure project, funded by the federal state of Thuringia (grant no. 2017 FGI 0031), including access to computing and storage facilities.

Author information

Authors and Affiliations

Computer Vision Group, Friedrich Schiller University Jena, Jena, Germany
Clemens-Alexander Brust & Joachim Denzler
Michael Stifel Center Jena, Jena, Germany
Joachim Denzler

Authors

Clemens-Alexander Brust
Joachim Denzler

Corresponding author

Correspondence to Clemens-Alexander Brust.

Editor information

Editors and Affiliations

TU Dortmund University, Dortmund, Germany
Gernot A. Fink
University of Hamburg, Hamburg, Germany
Simone Frintrop
University of Münster, Münster, Germany
Xiaoyi Jiang

About this paper

Cite this paper

Brust, CA., Denzler, J. (2019). Not Just a Matter of Semantics: The Relationship Between Visual and Semantic Similarity. In: Fink, G., Frintrop, S., Jiang, X. (eds) Pattern Recognition. DAGM GCPR 2019. Lecture Notes in Computer Science, vol 11824. Springer, Cham. https://doi.org/10.1007/978-3-030-33676-9_29

DOI: https://doi.org/10.1007/978-3-030-33676-9_29
Published: 25 October 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33675-2
Online ISBN: 978-3-030-33676-9
eBook Packages: Computer Science, Computer Science (R0)
