Image Captioning-Based Image Search Engine: An Alternative to Retrieval by Metadata

  • Conference paper
Soft Computing for Problem Solving

Abstract

Image retrieval is an integral part of many search engines, and search based on image metadata has been the primary approach to it. In this work, we implement a search engine aimed at better-quality image retrieval using a query image. Our implementation uses Elasticsearch to index the images available on the server and an intermediate captioning mechanism for both the search and retrieval processes. Image captioning is carried out using a VGG16 convolutional neural network. The engine was implemented and tested on the popular Flickr-8k benchmark dataset. The quality of the retrieved images was promising and suggests that intermediate captioning-based image search could be an alternative to metadata-based search engines.
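The caption-mediated retrieval idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `CaptionIndex` class is a hypothetical in-memory TF-IDF index standing in for Elasticsearch, and the captions are hard-coded by hand where the paper's system would generate them with a VGG16-based captioning model. The image identifiers and caption strings are likewise invented for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase a caption and keep only alphabetic word tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


class CaptionIndex:
    """Toy in-memory caption index (assumed stand-in for Elasticsearch)."""

    def __init__(self):
        self.docs = {}  # image id -> Counter of caption tokens

    def add(self, image_id, caption):
        """Index one image under the tokens of its caption."""
        self.docs[image_id] = Counter(tokenize(caption))

    def search(self, query_caption, top_k=3):
        """Rank indexed images by summed TF-IDF of the shared caption terms."""
        n = len(self.docs)
        query_terms = set(tokenize(query_caption))
        scores = {}
        for image_id, counts in self.docs.items():
            score = 0.0
            for term in query_terms:
                tf = counts.get(term, 0)
                if tf == 0:
                    continue
                # Document frequency: number of captions containing the term.
                df = sum(1 for c in self.docs.values() if term in c)
                score += tf * math.log(1 + n / df)
            if score > 0:
                scores[image_id] = score
        return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Indexing step: hypothetical captions, as if produced by a captioning model.
index = CaptionIndex()
index.add("img_001.jpg", "a dog runs through the grass")
index.add("img_002.jpg", "two children play on a beach")
index.add("img_003.jpg", "a brown dog jumps over a log")

# Query step: the query image would be captioned first; here the caption
# is hard-coded, then used as a text query against the caption index.
query_caption = "a dog jumps in the grass"
results = index.search(query_caption)
print(results)
```

In this toy setup the two dog-related images rank ahead of the beach image, because the query caption shares more distinctive terms with their captions. A production system would delegate the scoring to the search backend rather than compute it by hand.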

Notes

  1. https://images.google.com

  2. https://www.elastic.co/products/elasticsearch

  3. The input is an image and the output is a set of words.

Author information

Corresponding author

Correspondence to Tirtharaj Dash.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Iyer, S., Chaturvedi, S., Dash, T. (2019). Image Captioning-Based Image Search Engine: An Alternative to Retrieval by Metadata. In: Bansal, J., Das, K., Nagar, A., Deep, K., Ojha, A. (eds) Soft Computing for Problem Solving. Advances in Intelligent Systems and Computing, vol 817. Springer, Singapore. https://doi.org/10.1007/978-981-13-1595-4_14
