Abstract
For challenging visual recognition tasks such as scene classification and object detection, there is a need to bridge the semantic gap between low-level features and semantic concept descriptors. This requires mapping a scene image onto a semantic representation. The semantic multinomial (SMN) representation is a semantic representation of an image that corresponds to a vector of posterior probabilities of concepts. In this work, we propose to build a concept neural network (CoNN) to obtain the SMN representation for a scene image. An important issue in building a CoNN is that it requires the availability of ground truth concept labels. To address this, we propose to use pseudo-concepts obtained from the feature maps of the higher-level layers of a convolutional neural network. The effectiveness of the proposed approaches is studied using standard datasets.
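As a rough illustration of what a "vector of posterior probabilities of concepts" looks like, the sketch below normalizes a set of per-concept scores into a probability vector. The softmax normalization, the function name `semantic_multinomial`, and the example scores are assumptions made for illustration; the abstract does not specify how the CoNN produces its posteriors.

```python
import math


def semantic_multinomial(concept_scores):
    """Map raw per-concept scores to a probability vector (softmax).

    Assumption: softmax is used here only as one common way to obtain
    posterior-like probabilities; it is not taken from the paper.
    """
    # Subtract the largest score before exponentiating for numerical stability.
    peak = max(concept_scores)
    exps = [math.exp(s - peak) for s in concept_scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three illustrative (pseudo-)concepts.
scores = [2.0, 0.5, -1.0]
smn = semantic_multinomial(scores)
```

The resulting `smn` has one entry per concept, every entry is positive, and the entries sum to one, which is the property that makes it a multinomial over concepts.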
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Pradhan, D.K., Gupta, S., Thenkanidiyoor, V., Aroor Dinesh, D. (2018). Semantic Multinomial Representation for Scene Images Using CNN-Based Pseudo-concepts and Concept Neural Network. In: Rameshan, R., Arora, C., Dutta Roy, S. (eds) Computer Vision, Pattern Recognition, Image Processing, and Graphics. NCVPRIPG 2017. Communications in Computer and Information Science, vol 841. Springer, Singapore. https://doi.org/10.1007/978-981-13-0020-2_35
DOI: https://doi.org/10.1007/978-981-13-0020-2_35
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-0019-6
Online ISBN: 978-981-13-0020-2
eBook Packages: Computer Science, Computer Science (R0)