Image retrieval based on multi-concept detector and semantic correlation

  • Research Paper
Science China Information Sciences

Abstract

With the rapid development of future networks, multimedia data such as web images has grown explosively, so an efficient image retrieval engine is necessary. Previous studies concentrate on single-concept image retrieval, which has limited practical usability. In practice, users usually query Internet image retrieval systems with multiple concepts, but existing approaches are often ineffective because they merely combine single-concept query techniques. Semantic-concept-based multi-concept image retrieval has therefore become an urgent problem. In this paper, a novel Multi-Concept image Retrieval Model (MCRM) based on a multi-concept detector is proposed. It treats a multi-concept as a whole and learns each multi-concept directly from a rearranged multi-concept training set. The corresponding retrieval algorithm is then presented, in which the log-likelihood function of the predictions is maximized by a gradient descent approach. In addition, semantic correlations among single concepts and multi-concepts are employed to improve retrieval performance: the semantic correlation probability is estimated with three correlation measures, and the visual evidence is expressed through Bayes' theorem and estimated by a Support Vector Machine (SVM). Experimental results on the Corel and IAPR data sets show that the approach outperforms state-of-the-art methods. Furthermore, the model is particularly beneficial for multi-concept retrieval and for difficult retrieval with few relevant images.
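To make the idea of a "rearranged multi-concept training set" more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a multi-concept is simply a pair of single-concept labels, that an image is a positive example for a multi-concept when its annotation contains every label in the pair, and that one possible correlation measure is a plain co-occurrence estimate of P(multi-concept | single concept). The abstract does not name the three correlation measures, so this estimate, the function names `rearrange` and `cooccurrence_correlation`, and the toy data are all hypothetical.

```python
from itertools import combinations

# Toy annotated training set: image id -> set of single-concept labels.
# The data and all names below are hypothetical, for illustration only.
annotations = {
    "img1": {"sky", "sea", "beach"},
    "img2": {"sky", "sea"},
    "img3": {"tiger", "grass"},
    "img4": {"sky", "grass"},
}


def rearrange(annotations, size=2):
    """Map each multi-concept (a sorted tuple of `size` labels) to the
    set of images whose annotation contains every label in the tuple."""
    positives = {}
    for image_id, labels in annotations.items():
        for combo in combinations(sorted(labels), size):
            positives.setdefault(combo, set()).add(image_id)
    return positives


def cooccurrence_correlation(annotations, multi_concept, concept):
    """Assumed measure: estimate P(multi_concept | concept) as the fraction
    of images labelled with `concept` that contain all of `multi_concept`."""
    with_concept = [labels for labels in annotations.values() if concept in labels]
    if not with_concept:
        return 0.0
    hits = sum(1 for labels in with_concept if set(multi_concept) <= labels)
    return hits / len(with_concept)


multi_sets = rearrange(annotations)
print(sorted(multi_sets[("sea", "sky")]))
print(cooccurrence_correlation(annotations, ("sea", "sky"), "sky"))
```

In a fuller pipeline, the positive set of each multi-concept (with the remaining images as negatives) could be used to train one detector per multi-concept, for example an SVM with probability outputs as the abstract suggests, and the correlation estimate could then be combined with that visual evidence when ranking images.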

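Highlights

With the rapid development of future networks, multimedia data such as web images can be expected to grow explosively, so an efficient image retrieval engine is urgently needed. Existing research focuses mainly on single-concept image retrieval, which limits practical usability. In practice, most users of image retrieval systems issue multi-concept queries. To address this, existing semantic retrieval methods apply single-concept retrieval techniques to multi-concept queries; however, single-concept techniques do not take the scene context of a multi-concept into account, so the results are often poor. Semantic-concept-based multi-concept image retrieval has therefore become a pressing research problem. This paper proposes MCRM, a multi-concept image retrieval method based on a multi-concept detector. It treats a scene multi-concept as a contextual whole and learns it directly from a rearranged multi-concept training set. MCRM first presents a retrieval algorithm and then maximizes the likelihood function by gradient descent. In addition, semantic correlations between single concepts and scene multi-concepts are used to improve multi-concept retrieval performance. To measure multi-concept semantic correlation, MCRM proposes three methods for estimating the semantic correlation probability, while the visual evidence for whether a scene multi-concept is present in an image is transformed by Bayes' rule and then estimated probabilistically by a support vector machine. Experiments show that MCRM has a clear advantage in multi-concept image retrieval and in "difficult retrieval", where relevant images are scarce.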

Corresponding author

Correspondence to ChangQin Huang.

Cite this article

Xu, H., Huang, C., Pan, P. et al. Image retrieval based on multi-concept detector and semantic correlation. Sci. China Inf. Sci. 58, 1–15 (2015). https://doi.org/10.1007/s11432-015-5486-4

  • DOI: https://doi.org/10.1007/s11432-015-5486-4
