Improve scene categorization via sub-scene recognition

  • Original Paper
Machine Vision and Applications

Abstract

Traditional scene categorization methods tend to represent a scene holistically, as a distribution of the visual words observed in the image. They disregard spatial information within the scene and therefore cannot distinguish categories that share similar sub-scenes but differ in layout, or categories that are ambiguous by nature. To address this issue, we propose incorporating sub-scene attributes into the global description to improve categorization performance, especially in ambiguous cases. This is achieved by encoding sub-scenes with layout prototypes that capture the geometric essence of scenes more accurately and flexibly. The proposed method improves categorization accuracy to 92.26% on the widely used eight-scene dataset and outperforms all other published methods. The proposed method is also observed to be more accurate at detecting and evaluating ambiguous images.
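
The limitation of a purely holistic visual-word distribution can be illustrated with a small sketch. The code below is not the method proposed in the paper: it uses hypothetical toy images (grids of visual-word IDs) and a fixed split into horizontal bands as a simple stand-in for sub-scene layout, only to show that two images with different layouts can share the same global histogram while their region-wise histograms differ. All function names and data here are assumptions made for illustration.

```python
from collections import Counter


def global_histogram(word_map):
    """Count visual-word occurrences over the whole image (no spatial information)."""
    return Counter(word for row in word_map for word in row)


def region_histograms(word_map, n_bands=2):
    """Split the image into horizontal bands and count visual words in each band.

    A fixed band split is an assumption for this sketch; it is much cruder
    than the layout prototypes described in the abstract.
    """
    rows_per_band = len(word_map) // n_bands
    bands = []
    for b in range(n_bands):
        start = b * rows_per_band
        # The last band absorbs any leftover rows.
        end = len(word_map) if b == n_bands - 1 else start + rows_per_band
        bands.append(Counter(word for row in word_map[start:end] for word in row))
    return bands


# Hypothetical toy images: each cell holds a visual-word ID
# (0 could stand for a "sky-like" patch, 1 for a "water-like" patch).
image_a = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
image_b = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]

same_global = global_histogram(image_a) == global_histogram(image_b)
same_regions = region_histograms(image_a) == region_histograms(image_b)
print(same_global, same_regions)  # True False
```

Both toy images contain eight patches of each word, so the holistic histogram cannot tell them apart. Once the counts are kept per region, the difference in layout becomes visible.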

Acknowledgments

This work was supported in part by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China, under Project HKU718912E, and in part by the Postgraduate Studentship of the University of Hong Kong.

Author information

Corresponding author

Correspondence to Shan-shan Zhu.

About this article

Cite this article

Zhu, Ss., Yung, N.H.C. Improve scene categorization via sub-scene recognition. Machine Vision and Applications 25, 1561–1572 (2014). https://doi.org/10.1007/s00138-014-0622-5

  • DOI: https://doi.org/10.1007/s00138-014-0622-5
