Large Scale Visual Classification via Learned Dictionaries and Sparse Representation

Fu, Zhenyong; Lu, Hongtao; Deng, Nan; Cai, Nengbin

doi:10.1007/978-3-642-16530-6_38

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6319)

Included in the following conference series: International Conference on Artificial Intelligence and Computational Intelligence

Abstract

We address the large scale visual classification problem. Our approach is based on sparse and redundant representations over trained dictionaries. The proposed algorithm first trains one dictionary per visual category from the images of that category, using the K-SVD algorithm. Given a set of training images from a category, K-SVD seeks the dictionary that gives the best representation of each image in the set under strict sparsity constraints. The traditional classification method under large scale conditions is k-nearest-neighbor. In our method, a test image is instead classified by comparing the reconstruction residuals obtained with the different category dictionaries. To obtain the most effective dictionaries, we exploit a large scale image database collected from the Internet [2] and design experiments on nearly 1.6 million tiny images at a middle semantic level defined using WordNet. We compare image classification performance under different image resolutions and k-nearest-neighbor parameters. The experimental results demonstrate that the proposed algorithm outperforms k-nearest-neighbor in two respects: 1) discriminative capability on the large scale visual classification task, and 2) average running time for classifying one image.
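The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes NumPy, uses a basic orthogonal matching pursuit for sparse coding, a bare-bones K-SVD loop, and assigns a test vector to the category whose dictionary gives the smallest reconstruction residual (the usual reading of residual-based classification). The function names (`omp`, `ksvd`, `classify`), the synthetic two-category data, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def omp(D, x, sparsity):
    """Orthogonal matching pursuit: code x with at most `sparsity` atoms of D."""
    residual = x.copy()
    support, sol = [], np.zeros(0)
    coef = np.zeros(D.shape[1])
    for _ in range(sparsity):
        idx = int(np.argmax(np.abs(D.T @ residual)))  # best-correlated atom
        if idx in support:
            break
        support.append(idx)
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
        if np.linalg.norm(residual) < 1e-10:
            break
    coef[support] = sol
    return coef

def ksvd(X, n_atoms, sparsity, n_iter=5, seed=0):
    """Train a dictionary on the columns of X (shape: dim x n_samples)."""
    rng = np.random.default_rng(seed)
    # Assumption: initialise atoms from randomly chosen training samples.
    D = X[:, rng.choice(X.shape[1], n_atoms, replace=False)].astype(float)
    D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12
    for _ in range(n_iter):
        # Sparse coding stage: one code per training sample.
        A = np.column_stack([omp(D, x, sparsity) for x in X.T])
        # Dictionary update stage: refine one atom at a time.
        for k in range(n_atoms):
            users = np.flatnonzero(A[k])  # samples that use atom k
            if users.size == 0:
                continue
            A[k, users] = 0.0
            E = X[:, users] - D @ A[:, users]  # error without atom k
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, k] = U[:, 0]            # rank-1 fit gives the new atom
            A[k, users] = s[0] * Vt[0]   # and its updated coefficients
    return D

def classify(x, dictionaries, sparsity):
    """Return the label whose dictionary reconstructs x with least residual."""
    residuals = {label: np.linalg.norm(x - D @ omp(D, x, sparsity))
                 for label, D in dictionaries.items()}
    return min(residuals, key=residuals.get)

# Synthetic stand-in for image features: each category lies near its own
# random 3-dimensional subspace of a 16-dimensional space.
rng = np.random.default_rng(0)
dim, sub_dim, n_train, n_test = 16, 3, 60, 20

def sample(basis, n):
    signal = basis @ rng.standard_normal((basis.shape[1], n))
    return signal + 0.01 * rng.standard_normal((basis.shape[0], n))

bases = {label: np.linalg.qr(rng.standard_normal((dim, sub_dim)))[0]
         for label in ("cat_a", "cat_b")}
dictionaries = {label: ksvd(sample(B, n_train), n_atoms=6, sparsity=3)
                for label, B in bases.items()}

correct = total = 0
for label, B in bases.items():
    for x in sample(B, n_test).T:
        correct += classify(x, dictionaries, sparsity=3) == label
        total += 1
accuracy = correct / total
print(f"toy accuracy: {accuracy:.2f}")
```

Because classification only needs one sparse coding pass per category dictionary, its cost depends on the number of categories and atoms rather than on the number of training images, which is one plausible reason for the running-time advantage over k-nearest-neighbor that the abstract reports.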

References

1. Aharon, M., Elad, M., Bruckstein, A.M.: The K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing 54, 4311–4322 (2006)
2. Torralba, A., Fergus, R., Freeman, W.T.: Tiny Images. Technical Report, Computer Science and Artificial Intelligence Lab., MIT (2007)
3. Fellbaum, C.: WordNet: An Electronic Lexical Database. Bradford Books (1998)
4. Elad, M., Aharon, M.: Image denoising via sparse and redundant representations over learned dictionaries. IEEE Transactions on Image Processing 15, 3736–3745 (2006)
5. Donoho, D.L., Elad, M., Temlyakov, V.: Stable recovery of sparse overcomplete representations in the presence of noise. IEEE Transactions on Information Theory 147, 185–195 (2007)
6. Bryt, O., Elad, M.: Compression of facial images using the K-SVD algorithm. J. Vis. Commun. Image Represent 19(4), 270–283 (2008)
7. Fei-Fei, L., Perona, P.: A Bayesian hierarchical model for learning natural scene categories. In: Proc. CVPR (June 2005)
8. Grauman, K., Darrell, T.: The pyramid match kernel: Discriminative classification with sets of image features. In: Proc. ICCV (2005)
9. Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In: Proc. CVPR (2006)
10. Li, L., Wang, G., Fei-Fei, L.: OPTIMOL: automatic online picture collection via incremental model learning. In: Proc. CVPR (June 2007)
11. Lowe, D.: Distinctive image features from scale-invariant keypoints. IJCV 60(2), 91–110 (2004)
12. Matas, J., Chum, O., Martin, U., Pajdla, T.: Robust wide baseline stereo from maximally stable extremal regions. In: Proc. BMVC, vol. 1, pp. 384–393 (2002)
13. Mikolajczyk, K., Schmid, C.: Scale and affine invariant interest point detectors. IJCV 60(1), 63–86 (2004)
14. Mikolajczyk, K., Schmid, C.: A performance evaluation of local descriptors. IEEE Transactions on PAMI 27(10), 1615–1630 (2005)
15. Nister, D., Stewenius, H.: Scalable recognition with a vocabulary tree. In: Proc. CVPR (June 2006)
16. Sivic, J., Russell, B.C., Efros, A.A., Zisserman, A., Freeman, W.T.: Discovering objects and their location in images. In: Proc. ICCV, pp. 370–377 (2005)
17. Yeh, T., Lee, J., Darrell, T.: Adaptive vocabulary forests for dynamic indexing and category learning. In: Proc. ICCV (October 2007)
18. Blei, D., Ng, A., Jordan, M.: Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993–1022 (2003)
19. Hofmann, T.: Unsupervised learning by probabilistic latent semantic analysis. Machine Learning 43, 177–196 (2001)
20. Hofmann, T.: Probabilistic latent semantic indexing. In: Proc. SIGIR (1999)
21. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L.: ImageNet: A Large-Scale Hierarchical Image Database. In: Proc. CVPR (2009)
22. Russell, B., Torralba, A., Murphy, K., Freeman, W.: LabelMe: A database and web-based tool for image annotation. IJCV 77(1-3), 157–173 (2008)
23. Pati, Y.C., Rezaiifar, R., Krishnaprasad, P.S.: Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition. In: Conf. Rec. 27th Asilomar Conf. Signals, Syst. Comput., vol. 1 (1993)
24. Tropp, J.A.: Greed is good: Algorithmic results for sparse approximation. IEEE Trans. Inf. Theory 50, 2231–2242 (2004)
25. Donoho, D.: For Most Large Underdetermined Systems of Linear Equations the Minimal l1-Norm Solution Is Also the Sparsest Solution. Comm. Pure and Applied Math. 59(6), 797–829 (2006)
26. Candes, E., Romberg, J., Tao, T.: Stable Signal Recovery from Incomplete and Inaccurate Measurements. Comm. Pure and Applied Math. 59(8), 1207–1223 (2006)
27. Chen, S., Donoho, D., Saunders, M.: Atomic Decomposition by Basis Pursuit. SIAM Rev. 43(1), 129–159 (2001)

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
Zhenyong Fu & Hongtao Lu
Forensic center of Shanghai Police, Shanghai, China
Nan Deng & Nengbin Cai

Editor information

Editors and Affiliations

Department of Business Administration, Caritas Francis Hsu College, 18 Chui Ling Road, Tseung Kwan O, Hong Kong, China
Fu Lee Wang
School of Business Information Technology, RMIT University, City Campus, 124 La Trobe Street, 3000, Melbourne, Victoria, Australia
Hepu Deng
Department of Computer Science, Nanjing University, 210093, Nanjing, China
Yang Gao
School of Computer, Nanjing University of Posts and Telecommunications, 210003, Nanjing, China
Jingsheng Lei

About this paper

Cite this paper

Fu, Z., Lu, H., Deng, N., Cai, N. (2010). Large Scale Visual Classification via Learned Dictionaries and Sparse Representation. In: Wang, F.L., Deng, H., Gao, Y., Lei, J. (eds) Artificial Intelligence and Computational Intelligence. AICI 2010. Lecture Notes in Computer Science, vol 6319. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16530-6_38

DOI: https://doi.org/10.1007/978-3-642-16530-6_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16529-0
Online ISBN: 978-3-642-16530-6
eBook Packages: Computer Science, Computer Science (R0)
