Optimizing visual dictionaries for effective image retrieval

Arun, K. S.; Govindan, V. K.

doi:10.1007/s13735-015-0076-1

Regular Paper
Published: 01 February 2015

Volume 4, pages 165–185, (2015)
International Journal of Multimedia Information Retrieval

K. S. Arun¹ & V. K. Govindan¹

Abstract

Characterizing images by high-level concepts drawn from a learned visual dictionary is widely used in image classification and retrieval. This paper addresses the inference of discriminative visual dictionaries for effective image retrieval and examines a non-negative visual dictionary learning scheme for this purpose. Specifically, it proposes a non-negative matrix factorization framework that optimizes the dictionary under an \(\ell _0\)-sparseness constraint on the coefficient matrix. The method is a two-step iterative process consisting of a sparse encoding stage and a dictionary enhancement stage. In each iteration, the current estimate of the visual dictionary is updated with the proposed \(\ell _0\)-constrained gradient projection algorithm. A desirable attribute of this formulation is an adaptive sequential dictionary initialization procedure, which leads to a sharp drop in the approximation error and faster convergence. Finally, the proposed dictionary optimization scheme is used to derive a compact image representation for the retrieval task: a new image signature is obtained by projecting local descriptors onto the basis elements of the optimized visual dictionary and then aggregating the resulting sparse encodings into a single feature vector. Experimental results on several benchmark datasets show that the proposed system infers enhanced visual dictionaries and that the derived image feature vector achieves better retrieval results than state-of-the-art techniques.
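The two-step scheme summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm: the function names, the step-size rule, the random initialization (used here in place of the adaptive sequential initialization), and the sum-pooling aggregation are all assumptions made for illustration. The sketch alternates projected-gradient sparse encoding, where each code keeps at most k non-negative entries, with a projected-gradient dictionary update, and then pools the codes of one image into a single vector.

```python
import numpy as np

def project_l0_nonneg(H, k):
    """Clip to H >= 0, then keep only the k largest entries in each column."""
    H = np.maximum(H, 0.0)
    n_atoms = H.shape[0]
    if k < n_atoms:
        # Row indices of the (n_atoms - k) smallest entries of every column.
        smallest = np.argpartition(H, n_atoms - k, axis=0)[: n_atoms - k]
        np.put_along_axis(H, smallest, 0.0, axis=0)
    return H

def sparse_encode(X, D, k, H=None, n_steps=30):
    """Sparse encoding stage: projected gradient on 0.5 * ||X - D H||_F^2."""
    G, DtX = D.T @ D, D.T @ X
    step = 1.0 / (np.linalg.norm(G, 2) + 1e-12)  # 1 / Lipschitz constant
    if H is None:
        H = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_steps):
        H = project_l0_nonneg(H - step * (G @ H - DtX), k)
    return H

def update_dictionary(X, D, H, n_steps=30):
    """Dictionary stage: projected gradient with H fixed, keeping D >= 0."""
    HHt, XHt = H @ H.T, X @ H.T
    step = 1.0 / (np.linalg.norm(HHt, 2) + 1e-12)
    for _ in range(n_steps):
        D = np.maximum(D - step * (D @ HHt - XHt), 0.0)
    return D

def learn_dictionary(X, n_atoms, k, n_iter=20, seed=0):
    """Alternate the two stages; columns of X are local descriptors."""
    rng = np.random.default_rng(seed)
    # Assumption: random non-negative start, NOT the paper's adaptive scheme.
    D = rng.random((X.shape[0], n_atoms))
    H = sparse_encode(X, D, k)
    errors = [np.linalg.norm(X - D @ H)]
    for _ in range(n_iter):
        D = update_dictionary(X, D, H)
        H = sparse_encode(X, D, k, H=H)  # warm start from the previous codes
        errors.append(np.linalg.norm(X - D @ H))
    return D, H, errors

def image_signature(descriptors, D, k):
    """Encode one image's descriptors and sum-pool the codes (assumed pooling)."""
    H = sparse_encode(descriptors, D, k)
    v = H.sum(axis=1)
    return v / (np.linalg.norm(v) + 1e-12)

# Toy data: 200 random non-negative 16-dimensional "descriptors" (illustrative).
X = np.random.default_rng(1).random((16, 200))
D, H, errors = learn_dictionary(X, n_atoms=32, k=4)
signature = image_signature(X[:, :50], D, k=4)
```

In this sketch the sparsity level k and the number of atoms are free parameters, and the `errors` list tracks the Frobenius reconstruction error after each outer iteration. The pooling step is only one plausible way to aggregate sparse encodings into a fixed-length signature.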

Author information

Authors and Affiliations

Department of Computer Science and Engineering, National Institute of Technology, Calicut, India
K. S. Arun & V. K. Govindan

Corresponding author

Correspondence to K. S. Arun.

About this article

Cite this article

Arun, K.S., Govindan, V.K. Optimizing visual dictionaries for effective image retrieval. Int J Multimed Info Retr 4, 165–185 (2015). https://doi.org/10.1007/s13735-015-0076-1

Received: 23 June 2014
Accepted: 16 January 2015
Published: 01 February 2015
Issue Date: September 2015
DOI: https://doi.org/10.1007/s13735-015-0076-1
