Abstract
In this paper, we demonstrate the effectiveness of a customized ResNet for indoor–outdoor scene classification of both color and depth images. Such an approach can serve as an initial step in a scene classification/retrieval pipeline or in a single-image depth estimation task. The classification framework is built on a Residual Convolutional Neural Network (ResNet-18) and classifies an arbitrary scene as indoor or outdoor. We also demonstrate that the classification performance is invariant to the different weather conditions commonly encountered in outdoor scenes. The performance of our classification strategy is analyzed on a variety of publicly available datasets of indoor and outdoor scenes that also provide corresponding depth maps. The suggested approach achieves near-ideal performance in many scenarios, for both color and depth images, across datasets. We also show that it compares favorably with other state-of-the-art methods.
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Kumari, S., Jha, R.R., Bhavsar, A., Nigam, A. (2020). Indoor–Outdoor Scene Classification with Residual Convolutional Neural Network. In: Chaudhuri, B., Nakagawa, M., Khanna, P., Kumar, S. (eds) Proceedings of 3rd International Conference on Computer Vision and Image Processing. Advances in Intelligent Systems and Computing, vol 1024. Springer, Singapore. https://doi.org/10.1007/978-981-32-9291-8_26
DOI: https://doi.org/10.1007/978-981-32-9291-8_26
Publisher Name: Springer, Singapore
Print ISBN: 978-981-32-9290-1
Online ISBN: 978-981-32-9291-8
eBook Packages: Engineering, Engineering (R0)