Semantic image retrieval for complex queries using a knowledge parser

Chen, Hua; Trouve, Antoine; Murakami, Kazuaki J.; Fukuda, Akira

doi:10.1007/s11042-017-4932-2

Published: 21 June 2017

Volume 77, pages 10733–10751, (2018)

Multimedia Tools and Applications

Hua Chen¹, Antoine Trouve², Kazuaki J. Murakami¹ & Akira Fukuda¹

Abstract

To improve the accuracy of image retrieval systems, research focus has shifted from designing sophisticated low-level feature extraction algorithms to combining image retrieval with rich semantics and knowledge-based methods. In this paper, we aim to improve text-based image retrieval for complex natural language queries by using a semantic parser (Knowledge Parser, or K-Parser). From text written in natural language, the K-Parser extracts a graph-based semantic representation of the objects involved, their properties, and their relations. We analyze both the image captions and the natural language queries with the K-Parser. As a technical solution, we leverage RDF in two ways: first, we store the parsed image captions as RDF triples; second, we translate image queries into SPARQL queries. When the approach is applied to the Flickr8k dataset with a set of 16 custom queries, we observe that the K-Parser exhibits some biases that reduce the accuracy of the queries. We propose two techniques to address these weaknesses: (1) we introduce a set of rules that transform the output of the K-Parser and fix basic, recurrent parsing mistakes that occur on the Flickr8k captions; (2) we leverage two popular commonsense knowledge bases, ConceptNet and WordNet, to raise the accuracy of queries on broad concepts. With these two techniques, we fix most of the initial retrieval errors and accurately execute our set of 16 queries on the Flickr8k dataset.
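The two uses of RDF described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a small in-memory set of triples instead of an RDF store, and a hand-written pattern matcher instead of a SPARQL engine. The image names, the node identifiers (`run_1`, `dog_1`) and the relation names (`instance_of`, `agent`, `on`) are assumptions made for illustration and are not taken from the K-Parser's actual output vocabulary.

```python
from typing import Dict, Iterator, List, Optional, Sequence, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical parsed captions, stored as (subject, predicate, object) triples.
# The vocabulary is illustrative only, not the K-Parser's real output.
CAPTIONS: Dict[str, Set[Triple]] = {
    "img1.jpg": {  # "A dog runs on the grass."
        ("run_1", "instance_of", "run"),
        ("run_1", "agent", "dog_1"),
        ("dog_1", "instance_of", "dog"),
        ("run_1", "on", "grass_1"),
        ("grass_1", "instance_of", "grass"),
    },
    "img2.jpg": {  # "A child runs on the grass."
        ("run_1", "instance_of", "run"),
        ("run_1", "agent", "child_1"),
        ("child_1", "instance_of", "child"),
        ("run_1", "on", "grass_1"),
        ("grass_1", "instance_of", "grass"),
    },
    "img3.jpg": {  # "A dog sits on the grass."
        ("sit_1", "instance_of", "sit"),
        ("sit_1", "agent", "dog_1"),
        ("dog_1", "instance_of", "dog"),
        ("sit_1", "on", "grass_1"),
        ("grass_1", "instance_of", "grass"),
    },
}


def match(
    patterns: Sequence[Triple],
    triples: Set[Triple],
    binding: Optional[Dict[str, str]] = None,
) -> Iterator[Dict[str, str]]:
    """Yield variable bindings that satisfy every pattern (terms starting
    with '?' are variables), in the spirit of a SPARQL basic graph pattern."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or reject if it is bound to another value.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield from match(rest, triples, candidate)


def retrieve(patterns: Sequence[Triple]) -> List[str]:
    """Return the images whose caption graph matches all patterns."""
    return sorted(
        image
        for image, triples in CAPTIONS.items()
        if next(match(patterns, triples), None) is not None
    )


# Query "a dog running on grass", written as a graph pattern.
DOG_RUNNING_ON_GRASS: List[Triple] = [
    ("?e", "instance_of", "run"),
    ("?e", "agent", "?a"),
    ("?a", "instance_of", "dog"),
    ("?e", "on", "?g"),
    ("?g", "instance_of", "grass"),
]

print(retrieve(DOG_RUNNING_ON_GRASS))
```

Because the query is a graph pattern rather than a bag of keywords, the caption about a sitting dog and the caption about a running child are both rejected, even though each shares most of its words with the query.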

Notes

http://www.flickr.com
http://kparser.org
Refer to http://nlp.cs.illinois.edu/HockenmaierGroup/8k-pictures.html
Turtle (Terse RDF Triple Language) can display RDF triples in a concise format.
More detailed experiments are presented in Sections 6 and 7.
In our work, we only consider the relation “CapableOf” from ConceptNet.
For an example method, refer to http://cs.stanford.edu/people/karpathy/deepimagesent/rankingdemo/
In our work, we consider that “run on grass” and “run through grass” have the same meaning.
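Two of the notes above, the restriction to the ConceptNet relation "CapableOf" and the equivalence of "run on grass" and "run through grass", can be sketched as follows. This is only one plausible reading of how they might be applied. The `CAPABLE_OF` table is hand-written toy data rather than data fetched from ConceptNet, and the function names and the choice of "on" as the canonical preposition are assumptions of this sketch.

```python
from typing import Dict, List, Set, Tuple

Relation = Tuple[str, str, str]  # (verb, preposition, object)

# Toy stand-in for ConceptNet "CapableOf" edges: concept -> actions.
# Hand-written for illustration, not retrieved from ConceptNet.
CAPABLE_OF: Dict[str, Set[str]] = {
    "dog": {"run", "bark"},
    "child": {"run", "play"},
    "bird": {"fly"},
}

# Prepositions treated as equivalent, mapped to one canonical form.
# Only the on/through pair comes from the note; "on" as the canonical
# form is an arbitrary choice made here.
PREPOSITION_CANON: Dict[str, str] = {"through": "on"}


def normalize(relation: Relation) -> Relation:
    """Rewrite the preposition to its canonical form, if it has one."""
    verb, preposition, obj = relation
    return (verb, PREPOSITION_CANON.get(preposition, preposition), obj)


def concepts_capable_of(action: str) -> List[str]:
    """List the concepts that the toy CapableOf table links to an action."""
    return sorted(c for c, actions in CAPABLE_OF.items() if action in actions)


print(normalize(("run", "through", "grass")))
print(concepts_capable_of("run"))
```

Normalizing captions and queries with the same function means a query phrased with "through" can match a caption phrased with "on", and a lookup such as `concepts_capable_of("run")` gives a list of concrete concepts that a query on a broad concept could be expanded to.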

Author information

Authors and Affiliations

Graduate School of Information Science and Electrical Engineering, Kyushu University, 744, Motooka, Nishi-ku, Fukuoka, 819-0395, Fukuoka, Japan
Hua Chen, Kazuaki J. Murakami & Akira Fukuda
Institute of Systems, Information Technologies and Nanotechnologies (ISIT), 2-1-22-7F, Momochihama, Sawara-ku, Fukuoka, 814-0001, Fukuoka, Japan
Antoine Trouve

Corresponding author

Correspondence to Hua Chen.

About this article

Cite this article

Chen, H., Trouve, A., Murakami, K.J. et al. Semantic image retrieval for complex queries using a knowledge parser. Multimed Tools Appl 77, 10733–10751 (2018). https://doi.org/10.1007/s11042-017-4932-2

Received: 01 March 2017
Revised: 05 June 2017
Accepted: 06 June 2017
Published: 21 June 2017
Issue Date: May 2018
DOI: https://doi.org/10.1007/s11042-017-4932-2
