Abstract
In computer vision, the bag-of-words (BoW) model has been widely applied to image-related tasks such as large-scale image retrieval, image classification, and object categorization. Using sparse coding (SC) for feature coding in the BoW model can guarantee both sparsity of the coding vector and lower reconstruction error, and thus achieves better performance than the traditional vector quantization method. However, it suffers from a side effect of the non-smooth sparsity regularizer: quite different words may be selected for similar patches in order to favor sparsity, so the correlation between the corresponding coding vectors is lost. To address this problem, we propose a novel soft assignment method based on the index combination of the top-2 largest sparse codes of local descriptors, which lets the SC-based BoW tolerate different word selections for similar patches. To further ensure that similar patches select the same words and thus generate similar coding vectors, we propose a collaborative dictionary learning method that imposes a sparse code similarity regularization factor, along with row sparsity regularization across data instances, on top of group sparse coding. Experiments on the well-known public Oxford dataset demonstrate the effectiveness of the proposed methods.
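The abstract does not spell out how the index combination is formed, so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes that the indices of the two largest-magnitude coefficients of a sparse code are treated as an unordered pair and encoded as a single combined word id; the function name `top2_index_combination`, the encoding `i * K + j`, and the toy codes are all hypothetical.

```python
def top2_index_combination(code):
    """Map a sparse code to one id built from its two largest-magnitude entries.

    Hypothetical helper (an assumption, not taken from the paper): the
    unordered index pair (i, j) with i < j is encoded as i * K + j, where K
    is the dictionary size.
    """
    k = len(code)
    if k < 2:
        raise ValueError("need at least two dictionary words")
    # Rank word indices by descending absolute coefficient.
    # sorted() is stable, so ties keep the lower index first.
    ranked = sorted(range(k), key=lambda idx: -abs(code[idx]))
    # Drop the ordering of the two winners so a swap in rank gives the same id.
    i, j = sorted(ranked[:2])
    return i * k + j


# Two toy codes for "similar patches": the same two words are active,
# but the largest coefficient falls on a different word in each.
patch_a = [0.0, 0.9, 0.0, -0.5, 0.1]
patch_b = [0.0, 0.4, 0.0, 0.8, 0.0]

id_a = top2_index_combination(patch_a)
id_b = top2_index_combination(patch_b)
print(id_a, id_b)
```

Under these assumptions both toy codes map to the same combined id, whereas keeping only the single strongest word would assign them to words 1 and 3 respectively. This illustrates the kind of tolerance to different word selection that the abstract describes.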
This work was supported by the National Natural Science Foundation of China (61371194, 61672361, 61572472), the Beijing Natural Science Foundation (4152050, 4152012), and the Beijing Advanced Innovation Center for Imaging Technology (BAICIT-2016009).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Liu, J., Tang, S., Li, Y. (2017). Collaborative Dictionary Learning and Soft Assignment for Sparse Coding of Image Features. In: Amsaleg, L., Guðmundsson, G., Gurrin, C., Jónsson, B., Satoh, S. (eds) MultiMedia Modeling. MMM 2017. Lecture Notes in Computer Science, vol 10132. Springer, Cham. https://doi.org/10.1007/978-3-319-51811-4_36
DOI: https://doi.org/10.1007/978-3-319-51811-4_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-51810-7
Online ISBN: 978-3-319-51811-4
eBook Packages: Computer Science (R0)