Abstract
Scene recognition is one of the hallmark tasks in computer vision, as it provides rich information beyond object recognition and action recognition. It is easy to accept that scene images from the same class always include the same essential objects and relations, for example, scene images of “wedding” usually have bridegroom and bride next to him. Following this observation, we introduce a novel idea to boost the accuracy of scene recognition by mining essential scene sub-graph and learning a bi-enhanced knowledge space. The essential scene sub-graph describes the essential objects and their relations for each scene class. The learned knowledge space is bi-enhanced by global representation on the entire image and local representation on the corresponding essential scene sub-graph. Experimental results on the constructed dataset called Scene 30 demonstrate the effectiveness of our proposed method.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)
Chen, P.H., Lin, C.J., Schölkopf, B.: A tutorial on-support vector machines. Appl. Stoch. Models Bus. Ind. 21(2), 111–136 (2005)
Cheng, X., Lu, J., Feng, J., Yuan, B., Zhou, J.: Scene recognition with objectness. Pattern Recogn. 74, 474–487 (2018)
Chollet, F., et al.: Keras (2015). https://github.com/keras-team/keras
Felzenszwalb, P.F., Girshick, R.B., McAllester, D., Ramanan, D.: Object detectionwith discriminatively trained part-based models. PAMI 32(9), 1627–1645 (2010)
Geman, S., Graffigne, C.: Markov random field image models and their applications to computer vision. In: Proceedings of the International Congress of Mathematicians, vol. 1, p. 2 (1986)
Gong, Y., Wang, L., Guo, R., Lazebnik, S.: Multi-scale orderless pooling of deep convolutional activation features. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8695, pp. 392–407. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10584-0_26
Herranz, L., Jiang, S., Li, X.: Scene recognition with CNNs: objects, scales and dataset bias. In: CVPR, pp. 571–579 (2016)
Huang, S., Xu, Z., Tao, D., Zhang, Y.: Part-stacked CNN for fine-grained visual categorization. In: CVPR, pp. 1173–1182 (2016)
Krishna, R., et al.: Visual genome: connecting language and vision using crowdsourced dense image annotations. IJCV 123(1), 32–73 (2017)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
Bao, B.-K., Zhu, G., Shen, J., Yan, S.: Robust image analysis with sparse representation on quantized visual features. IEEE Trans. Image Process. 22(3), 860–871 (2013)
Li, L.J., Su, H., Fei-Fei, L., Xing, E.P.: Object bank: A high-level image representation for scene classification and semantic feature sparsification. In: Advances in Neural Information Processing Systems, pp. 1378–1386 (2010)
Margolin, R., Zelnik-Manor, L., Tal, A.: OTC: a novel local descriptor for scene classification. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8695, pp. 377–391. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10584-0_25
Oliva, A., Torralba, A.: Modeling the shape of the scene: a holistic representationof the spatial envelope. IJCV 42(3), 145–175 (2001)
Parizi, S.N., Oberlin, J.G., Felzenszwalb, P.F.: Reconfigurable models for scene recognition. In: CVPR 2012, pp. 2775–2782. IEEE (2012)
Russakovsky, O., et al.: Imagenet large scale visual recognition challenge. IJCV 115(3), 211–252 (2015)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
Stamp, M., Professor, A.: A revealing introduction to hidden Markov models. IEEE ASSP Magruine 1(24), 258–261 (2004)
Sudderth, E.B., Torralba, A., Freeman, W.T., Willsky, A.S.: Learning hierarchical models of scenes, objects, and parts. In: ICCV 2005, vol. 2, pp. 1331–1338. IEEE (2005)
Wang, Z., Wang, L., Wang, Y., Zhang, B., Qiao, Y.: Weakly supervised patchnets: describing and aggregating local patches for scene recognition. TIP 26(4), 2028–2041 (2017)
Wu, J., Rehg, J.M.: Centrist: a visual descriptor for scene categorization. PAMI 33(8), 1489–1501 (2011)
Xie, G.S., Zhang, X.Y., Yan, S., Liu, C.L.: Hybrid CNN and dictionary-based models for scene recognition and domain adaptation. IEEE Trans. Circuits Syst. Video Technol. 27(6), 1263–1274 (2017)
Zhou, B., Lapedriza, A., Khosla, A., Oliva, A., Torralba, A.: Places: a 10 million image database for scene recognition. PAMI 40, 1452–1464 (2017)
Bao, B.-K., Liu, G., Changsheng, X., Yan, S.: Inductive robust principal component analysis. IEEE Trans. Image Process. 21(8), 3794–3800 (2012)
Bao, B.-K., Min, W., Li, T., Changsheng, X.: Joint local and global consistency on interdocument and interword relationships for co-clustering. IEEE Trans. Cybern. 45(1), 15–28 (2015)
Min, W., Bao, B.-K., Mei, S., Zhu, Y., Rui, Y., Jiang, S.: You are what you eat: exploring rich recipe information for cross-region food analysis. IEEE Trans. Multimed. 20(4), 950–964 (2018)
Bao, B.-K., Changsheng, X., Min, W., Hossain, M.S.: Cross-platform emerging topic detection and elaboration from multimedia streams. TOMCCAP 11(4), 54 (2015)
Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., Oliva, A.: Learning deep features for scene recognition using places database. In: Advances in Neural Information Processing Systems, pp. 487–495 (2014)
Acknowledgement
This work is supported by the National Key Research & Development Plan of China (No. 2017YFB1002800), by the National Natural Science Foundation of China under Grant 61872424, 61572503, 61720106006, 61432019, and by NUPTSF (No. NY218001), also supported by the Key Research Program of Frontier Sciences, CAS, Grant NO. QYZDJ-SSW-JSC039, and the K.C. Wong Education Foundation.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhang, J., Bao, BK., Xu, C. (2019). Scene Recognition via Bi-enhanced Knowledge Space Learning. In: Chang, CY., Lin, CC., Lin, HH. (eds) New Trends in Computer Technologies and Applications. ICS 2018. Communications in Computer and Information Science, vol 1013. Springer, Singapore. https://doi.org/10.1007/978-981-13-9190-3_23
Download citation
DOI: https://doi.org/10.1007/978-981-13-9190-3_23
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9189-7
Online ISBN: 978-981-13-9190-3
eBook Packages: Computer ScienceComputer Science (R0)