Improve scene categorization via sub-scene recognition

Zhu, Shan-shan; Yung, Nelson H. C.

doi:10.1007/s00138-014-0622-5

Original Paper
Published: 03 June 2014

Machine Vision and Applications, Volume 25, pages 1561–1572 (2014)

Shan-shan Zhu¹ &
Nelson H. C. Yung¹

305 Accesses
3 Citations

Abstract

Traditional scene categorization methods tend to generalize the representation of a scene holistically, by calculating a distribution of the visual words observed in the image. They disregard spatial information within the scene and cannot discern categories that share similar sub-scenes but differ in layout, or categories that are ambiguous by nature. To address this issue, we propose to incorporate sub-scene attributes into global descriptions to improve categorization performance, especially in ambiguous cases. This is achieved by encoding sub-scenes with layout prototypes that capture the geometric essence of scenes more accurately and flexibly. The proposed method improves categorization accuracy to 92.26% on the widely used eight-scene dataset and outperforms all other published methods. The proposed method is also observed to be more accurate at detecting and evaluating ambiguous images.
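The limitation described above can be illustrated with a minimal sketch. It is not the authors' method: the toy word maps, the two-word vocabulary, and the fixed grid split in `region_histograms` are all assumptions made for illustration, and the grid is only a crude stand-in for sub-scenes, not the layout prototypes proposed in the paper. The sketch shows that a holistic visual-word histogram is identical for two images with the same content in a different layout, while per-region histograms differ.

```python
import numpy as np

def bow_histogram(word_map, vocab_size):
    """Normalized histogram of visual-word indices over a whole word map."""
    counts = np.bincount(word_map.ravel(), minlength=vocab_size)
    return counts / counts.sum()

def region_histograms(word_map, vocab_size, rows=2, cols=1):
    """Concatenate per-region histograms over a rows x cols grid.

    The fixed grid is a hypothetical stand-in for sub-scenes.
    """
    parts = []
    for band in np.array_split(word_map, rows, axis=0):
        for cell in np.array_split(band, cols, axis=1):
            parts.append(bow_histogram(cell, vocab_size))
    return np.concatenate(parts)

# Toy maps of quantized visual words (assumed): 0 = sky-like, 1 = water-like.
sky_over_water = np.array([[0, 0, 0, 0],
                           [0, 0, 0, 0],
                           [1, 1, 1, 1],
                           [1, 1, 1, 1]])
water_over_sky = sky_over_water[::-1]  # same words, vertically flipped layout

global_a = bow_histogram(sky_over_water, 2)
global_b = bow_histogram(water_over_sky, 2)
local_a = region_histograms(sky_over_water, 2)
local_b = region_histograms(water_over_sky, 2)

# The holistic descriptors match; the region-aware descriptors do not.
same_global = bool(np.allclose(global_a, global_b))
same_local = bool(np.allclose(local_a, local_b))
print(same_global, same_local)
```

A classifier that sees only the global histogram cannot separate the two toy images, which is the kind of layout ambiguity that motivates adding sub-scene information to the global description.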

Acknowledgments

This work was supported in part by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China, under Project HKU718912E, and in part by the Postgraduate Studentship of the University of Hong Kong.

Author information

Authors and Affiliations

Laboratory for Intelligent Transportation Systems Research, Department of Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong SAR, China
Shan-shan Zhu & Nelson H. C. Yung

Corresponding author

Correspondence to Shan-shan Zhu.

About this article

Cite this article

Zhu, Ss., Yung, N.H.C. Improve scene categorization via sub-scene recognition. Machine Vision and Applications 25, 1561–1572 (2014). https://doi.org/10.1007/s00138-014-0622-5

Received: 11 September 2013
Revised: 12 March 2014
Accepted: 07 May 2014
Published: 03 June 2014
Issue Date: August 2014
DOI: https://doi.org/10.1007/s00138-014-0622-5
