Collaborative Dictionary Learning and Soft Assignment for Sparse Coding of Image Features

  • Conference paper
MultiMedia Modeling (MMM 2017)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 10132))

Abstract

In computer vision, the bag-of-words (BoW) model has been widely applied to image-related tasks such as large-scale image retrieval, image classification, and object categorization. Using sparse coding (SC) for feature coding in the BoW model can guarantee both sparse coding vectors and lower reconstruction error, and can thus achieve better performance than traditional vector quantization. However, SC suffers from a side effect of its non-smooth sparsity regularizer: to favor sparsity, quite different words may be selected for similar patches, so the correlation between the corresponding coding vectors is lost. To address this problem, we propose a novel soft assignment method based on the index combination of the top-2 largest sparse codes of local descriptors, which makes SC-based BoW tolerate different word selections for similar patches. To further ensure that similar patches select the same words and thus generate similar coding vectors, we propose a collaborative dictionary learning method that imposes a sparse code similarity regularization factor, along with row sparsity regularization across data instances, on top of group sparse coding. Experiments on the well-known public Oxford dataset demonstrate the effectiveness of the proposed methods.
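The abstract does not spell out how the top-2 index combination is formed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each local descriptor already has a sparse code over the dictionary, that "largest" means largest in absolute value, and that the combined index is the unordered pair of the two selected word indices. The function names `top2_word_pair` and `pair_histogram` are hypothetical.

```python
from collections import Counter
from typing import Iterable, Sequence, Tuple


def top2_word_pair(code: Sequence[float]) -> Tuple[int, int]:
    """Return the indices of the two largest-magnitude coefficients.

    Assumption: the pair is unordered, so it is returned sorted by index.
    """
    if len(code) < 2:
        raise ValueError("need at least two dictionary words")
    # Rank word indices by absolute coefficient, largest first.
    ranked = sorted(range(len(code)), key=lambda i: abs(code[i]), reverse=True)
    first, second = ranked[0], ranked[1]
    return (first, second) if first < second else (second, first)


def pair_histogram(codes: Iterable[Sequence[float]]) -> Counter:
    """Count how often each top-2 word pair occurs over a set of descriptors."""
    return Counter(top2_word_pair(code) for code in codes)


# Two similar patches whose strongest word differs (word 1 vs. word 2).
patch_a = [0.0, 0.6, 0.5, 0.0, 0.0]
patch_b = [0.0, 0.5, 0.6, 0.0, 0.0]

pair_a = top2_word_pair(patch_a)
pair_b = top2_word_pair(patch_b)
histogram = pair_histogram([patch_a, patch_b])
```

In this toy case a hard assignment to the single strongest word would send the two patches to different words, while the unordered top-2 pair is identical for both, which is the kind of tolerance the abstract describes.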

This work was supported by the National Natural Science Foundation of China (61371194, 61672361, 61572472), the Beijing Natural Science Foundation (4152050, 4152012), and the Beijing Advanced Innovation Center for Imaging Technology (BAICIT-2016009).

Corresponding author

Correspondence to Sheng Tang.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Liu, J., Tang, S., Li, Y. (2017). Collaborative Dictionary Learning and Soft Assignment for Sparse Coding of Image Features. In: Amsaleg, L., Guðmundsson, G., Gurrin, C., Jónsson, B., Satoh, S. (eds) MultiMedia Modeling. MMM 2017. Lecture Notes in Computer Science, vol 10132. Springer, Cham. https://doi.org/10.1007/978-3-319-51811-4_36

  • DOI: https://doi.org/10.1007/978-3-319-51811-4_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-51810-7

  • Online ISBN: 978-3-319-51811-4

  • eBook Packages: Computer Science, Computer Science (R0)
