A context-aware semantic modeling framework for efficient image retrieval

Arun, K. S.; Govindan, V. K.

doi:10.1007/s13042-016-0498-y

Original Article
Published: 06 February 2016

Volume 8, pages 1259–1285, (2017)

International Journal of Machine Learning and Cybernetics

K. S. Arun¹ &
V. K. Govindan¹

450 Accesses
9 Citations

Abstract

In recent years, high-level image representations have gained popularity in image classification and retrieval tasks. This paper proposes an efficient scheme, the semantic context model, for deriving high-level image descriptors well suited to retrieval. The semantic context model uses an undirected graphical model formulation that jointly exploits low-level visual features and contextual information to classify local image blocks into a set of predefined concept classes. The contextual information comprises concept co-occurrences and their spatial correlation statistics. More expressive potential functions are introduced to capture the structural dependencies among semantic concepts. The proposed framework proceeds in three steps. First, optimal values of the model parameters that impose spatial consistency of concept labels among local image blocks are learned from training data. Then, the semantics of the constituent blocks of an unseen image are inferred using an improved message-passing algorithm. Finally, a compact but discriminative image signature is derived by integrating the frequency of occurrence of the regional semantics. Experimental results on several benchmark datasets show that the semantic context model can effectively resolve local ambiguities and consequently improve concept recognition performance in complex images. Moreover, the retrieval efficiency of the new semantics-based image feature is found to be much better than that of state-of-the-art approaches.
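The final step described in the abstract, building an image signature from the frequency of occurrence of regional semantics, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each image block has already been assigned an integer concept label by the inference step, and the function names, the example labels, and the use of histogram intersection as the similarity measure are all assumptions made here for illustration.

```python
from collections import Counter


def concept_signature(block_labels, num_concepts):
    """Return the normalized frequency of each concept over an image's blocks.

    block_labels: one integer concept label per image block (assumed to come
    from a prior labelling step; labels outside 0..num_concepts-1 are ignored).
    """
    if not block_labels:
        raise ValueError("block_labels must not be empty")
    counts = Counter(block_labels)
    total = len(block_labels)
    return [counts.get(c, 0) / total for c in range(num_concepts)]


def histogram_intersection(sig_a, sig_b):
    """Similarity of two signatures; an assumed measure, chosen for illustration."""
    return sum(min(a, b) for a, b in zip(sig_a, sig_b))


# Hypothetical block labels for two images with three concept classes.
query = concept_signature([0, 0, 1, 2], num_concepts=3)
candidate = concept_signature([0, 1, 1, 2], num_concepts=3)
similarity = histogram_intersection(query, candidate)
```

The signature has one entry per concept class, so its length is independent of image size, which is one way to read the "compact" property claimed in the abstract.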

Author information

Authors and Affiliations

Department of Computer Science and Engineering, National Institute of Technology, Calicut, India
K. S. Arun & V. K. Govindan

Corresponding author

Correspondence to K. S. Arun.

About this article

Cite this article

Arun, K.S., Govindan, V.K. A context-aware semantic modeling framework for efficient image retrieval. Int. J. Mach. Learn. & Cyber. 8, 1259–1285 (2017). https://doi.org/10.1007/s13042-016-0498-y

Received: 08 April 2015
Accepted: 18 January 2016
Published: 06 February 2016
Issue Date: August 2017
DOI: https://doi.org/10.1007/s13042-016-0498-y
