Location-Aware Image Classification

  • Conference paper
MultiMedia Modeling (MMM 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9516)

Abstract

Currently, the most popular image classification methods are based on global image representations. Such methods face an inherent mismatch: the object may appear anywhere in the image, yet a global representation describes the whole image and cannot adapt to the object's position. In this paper, we propose a novel location-aware image classification framework to address this problem. In our framework, an image is classified based on a local image representation, and the classifier is learned by iterative multiple-instance learning with a latent SVM; that is, we infer the object location with the latent SVM to improve image classification. Our method is efficient and outperforms the popular spatial pyramid matching (SPM) method and the Region-Based Latent SVM (RBLSVM) method [1] on the challenging PASCAL VOC dataset.
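
The alternation described in the abstract (train a classifier on local windows, re-infer the object location, repeat) can be illustrated with a minimal sketch. This is not the authors' implementation. The toy three-dimensional features, the helper names (`train_linear_svm`, `best_window`, `train_location_aware`, `image_score`), the Pegasos-style solver, the choice to start from window 0 as a whole-image window, and the treatment of every window in a negative image as a negative example are all assumptions made for illustration.

```python
import random

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_linear_svm(X, y, epochs=50, lam=0.01, seed=0):
    """Pegasos-style stochastic subgradient descent on the L2-regularised hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)                      # decaying step size
            violated = y[i] * dot(w, X[i]) < 1.0       # margin check before the step
            w = [(1.0 - eta * lam) * wj for wj in w]   # shrink (regulariser)
            if violated:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def best_window(w, bag):
    """Latent step: index of the highest-scoring window in one image."""
    return max(range(len(bag)), key=lambda k: dot(w, bag[k]))

def train_location_aware(pos_bags, neg_bags, rounds=3):
    # Assumption: window 0 of each positive image is the whole image,
    # so the first round mimics a global representation.
    latent = [0] * len(pos_bags)
    w = None
    for _ in range(rounds):
        # Positives contribute only their currently selected window.
        X = [bag[k] for bag, k in zip(pos_bags, latent)]
        y = [1] * len(X)
        # Assumption: every window of a negative image is a negative example.
        for bag in neg_bags:
            X.extend(bag)
            y.extend([-1] * len(bag))
        w = train_linear_svm(X, y)
        latent = [best_window(w, bag) for bag in pos_bags]
    return w, latent

def image_score(w, bag):
    """Image-level score is the score of its best window."""
    return max(dot(w, x) for x in bag)

# Toy features: [object evidence, clutter, constant bias term].
pos_bags = [
    [[0.5, 0.5, 1.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]],  # object in window 1
    [[0.5, 0.5, 1.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0]],  # object in window 2
]
neg_bags = [
    [[0.0, 1.0, 1.0], [0.0, 0.8, 1.0], [0.0, 0.9, 1.0]],
    [[0.0, 0.9, 1.0], [0.0, 1.0, 1.0], [0.0, 0.8, 1.0]],
]

w, latent = train_location_aware(pos_bags, neg_bags)
print("inferred object windows:", latent)
```

In this sketch the image-level decision is the maximum window score, so the inferred location falls out as the argmax of the same computation that classifies the image. A practical system would replace the toy features with real window descriptors and the hand-written solver with an established linear SVM library.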

References

  1. Yakhnenko, O., Verbeek, J., Schmid, C.: Region-based image classification with a latent SVM model. Research report RR-7665, INRIA (2011)

  2. Sivic, J., Zisserman, A.: Video Google: a text retrieval approach to object matching in videos. In: Proceedings of ICCV, pp. 1470–1477 (2003)

  3. Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In: Proceedings of CVPR (2006)

  4. Grauman, K., Darrell, T.: The pyramid match kernel: discriminative classification with sets of image features. In: ICCV (2005)

  5. Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., Gong, Y.: Locality-constrained linear coding for image classification. In: Proceedings of CVPR (2010)

  6. Song, Z., Chen, Q., Huang, Z., Hua, Y., Yan, S.: Contextualizing object detection and classification. In: Proceedings of CVPR (2011)

  7. Xie, L., Tian, Q., Wang, M., Zhang, B.: Spatial pooling of heterogeneous features for image classification. IEEE Trans. Image Process. 23, 1994–2008 (2014)

  8. Qi, G.J., Hua, X.S., Rui, Y., Tang, J., Zhang, H.J.: Image classification with kernelized spatial-context. IEEE Trans. Multimedia 12, 278–287 (2010)

  9. Dietterich, T.G., Lathrop, R.H., Lozano-Perez, T.: Solving the multiple instance problem with axis-parallel rectangles. Artif. Intell. 89, 31–71 (1997)

  10. Wang, X., Bai, X., Liu, W., Latecki, L.J.: Feature context for image classification and object detection. In: Proceedings of CVPR (2011)

  11. Felzenszwalb, P., Girshick, R., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part based models. IEEE Trans. Pattern Anal. Mach. Intell. 32, 1627–1645 (2010)

  12. Belongie, S., Malik, J., Puzicha, J.: Shape matching and object recognition using shape contexts. IEEE Trans. Pattern Anal. Mach. Intell. 24, 509–522 (2002)

  13. Andrews, S., Tsochantaridis, I., Hofmann, T.: Support vector machines for multiple-instance learning. In: Proceedings of Advances in Neural Information Processing Systems (2003)

  14. Hong, R., Wang, M., Gao, Y., Tao, D., Li, X., Wu, X.: Image annotation by multiple-instance learning with discriminative feature mapping and selection. IEEE Trans. Cybern. 44, 669–680 (2014)

  15. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes Challenge 2007 (VOC2007) Results (2007). http://www.pascal-network.org/challenges/VOC/voc2007/workshop/index.html

  16. Wang, M., Li, G., Lu, Z., Gao, Y., Chua, T.S.: When Amazon meets Google: product visualization by exploring multiple web sources. ACM Trans. Internet Technol. (TOIT) 12, 12 (2013)

  17. Wang, M., Li, H., Tao, D., Lu, K., Wu, X.: Multimodal graph-based reranking for web image search. IEEE Trans. Image Process. 21, 4649–4661 (2012)

  18. Wang, X., Feng, B., Bai, X., Liu, W., Latecki, L.J.: Bag of contour fragments for robust shape classification. Pattern Recogn. 47, 2116–2125 (2014)

  19. Zhu, J., Wu, T., Zhu, J., Yang, X., Zhang, W.: Learning reconfigurable scene representation by tangram model. In: 2012 IEEE Workshop on Applications of Computer Vision (WACV), pp. 449–456. IEEE (2012)

  20. Fergus, R., Perona, P., Zisserman, A.: Object class recognition by unsupervised scale-invariant learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2003)

  21. Lee, Y.J., Grauman, K.: Object-graphs for context-aware category discovery. IEEE Trans. Pattern Anal. Mach. Intell. 34, 346–358 (2011)

  22. Yuan, J., Wu, Y.: Spatial random partition for common visual pattern discovery. In: Proceedings of ICCV (2007)

  23. Zhu, L.L., Lin, C.X., Huang, H., Chen, Y., Yuille, A.L.: Unsupervised structure learning: hierarchical recursive composition, suspicious coincidence and competitive exclusion. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part II. LNCS, vol. 5303, pp. 759–773. Springer, Heidelberg (2008)

  24. Zhu, J., Zou, W., Yang, X., Zhang, R., Zhou, Q., Zhang, W.: Image classification by hierarchical spatial pooling with partial least squares analysis. In: BMVC, pp. 1–11 (2012)

  25. Khan, I., Roth, P.M., Bischof, H.: Learning object detectors from weakly-labeled internet images. In: OAGM Workshop (2010)

  26. Alexe, B., Deselaers, T., Ferrari, V.: What is an object? In: Proceedings of CVPR (2010)

  27. Vijayanarasimhan, S., Grauman, K.: Keywords to visual categories: multiple-instance learning for weakly supervised object categorization. In: Proceedings of CVPR (2008)

  28. Pandey, M., Lazebnik, S.: Scene recognition and weakly supervised object localization with deformable part-based models. In: Proceedings of ICCV (2011)

  29. Russakovsky, O., Lin, Y., Yu, K., Fei-Fei, L.: Object-centric spatial pooling for image classification. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part II. LNCS, vol. 7573, pp. 1–15. Springer, Heidelberg (2012)

  30. Harzallah, H., Jurie, F., Schmid, C.: Combining efficient object localization and image classification. In: International Conference on Computer Vision (2009)

  31. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60, 91–110 (2004)

  32. Yang, J., Yu, K., Gong, Y., Huang, T.: Linear spatial pyramid matching using sparse coding for image classification. In: Proceedings of CVPR (2009)

  33. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: CVPR (2005)

  34. Oliva, A., Torralba, A.: Modeling the shape of the scene: a holistic representation of the spatial envelope. Int. J. Comput. Vis. 42, 145–175 (2001)

  35. Fan, R.E., Chang, K.W., Hsieh, C.J., Wang, X.R., Lin, C.J.: LIBLINEAR: a library for large linear classification. J. Mach. Learn. Res. 9, 1871–1874 (2008)

  36. Quack, T., Ferrari, V., Leibe, B., Gool, L.V.: Efficient mining of frequent and distinctive feature configurations. In: International Conference on Computer Vision (ICCV 2007) (2007)

  37. Liu, C., Yuen, J., Torralba, A., Sivic, J., Freeman, W.T.: SIFT flow: dense correspondence across different scenes. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part III. LNCS, vol. 5304, pp. 28–42. Springer, Heidelberg (2008)

  38. Chatfield, K., Lempitsky, V., Vedaldi, A., Zisserman, A.: The devil is in the details: an evaluation of recent feature encoding methods. In: Proceedings of the British Machine Vision Conference (BMVC) (2011)

Acknowledgments

This work was primarily supported by the National Natural Science Foundation of China (NSFC) (No. 61503145). This material is also based upon work supported by the NSF under Grants No. IIS-1302164 and OIA-1027897.

Author information

Corresponding author

Correspondence to Xinggang Wang.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Wang, X., Yang, X., Liu, W., Duan, C., Latecki, L.J. (2016). Location-Aware Image Classification. In: Tian, Q., Sebe, N., Qi, GJ., Huet, B., Hong, R., Liu, X. (eds) MultiMedia Modeling. MMM 2016. Lecture Notes in Computer Science, vol 9516. Springer, Cham. https://doi.org/10.1007/978-3-319-27671-7_69

  • DOI: https://doi.org/10.1007/978-3-319-27671-7_69

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27670-0

  • Online ISBN: 978-3-319-27671-7

  • eBook Packages: Computer Science, Computer Science (R0)
