Abstract
Most successful approaches to category-level image classification build on the Bag-of-Words (BoW) model. To incorporate spatial image information into the BoW model, Spatial Pyramids (SPs) are used. However, spatial pyramids are rigid and rely on predefined grid configurations. As a consequence, they often fail to coincide with the underlying spatial structure of images from different categories, which may reduce classification accuracy.
This study addresses this problem with a selective Spatial Pyramid (SP) representation that automatically learns the most appropriate shape for each category. The proposed approach builds an image representation by inferring the constituent geometrical parts. As a result, the representation retains descriptive spatial information and yields a structural description of the image. Large-scale experiments on Pascal VOC2007 and Caltech101 show that SPs obtained by selective search outperform standard SPs.
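For orientation, the rigid baseline that the abstract contrasts against can be sketched as follows. This is a minimal illustration of a standard fixed-grid spatial pyramid over BoW assignments, not the selective representation proposed in the paper. The function name, the normalized-coordinate input, and the choice of grid levels are assumptions made for the example, and the per-level weighting used in spatial pyramid matching is omitted.

```python
import numpy as np

def spatial_pyramid_histogram(xy, words, vocab_size, levels=(1, 2, 4)):
    """Concatenate per-cell BoW histograms over fixed, predefined grids.

    xy:     (N, 2) keypoint positions, assumed normalized to [0, 1].
    words:  (N,) visual-word index of each keypoint, in [0, vocab_size).
    levels: grid size per pyramid level (1 -> 1x1, 2 -> 2x2, 4 -> 4x4).
    """
    xy = np.asarray(xy, dtype=float)
    words = np.asarray(words, dtype=int)
    parts = []
    for g in levels:
        # Assign each keypoint to a grid cell; clip handles coordinates equal to 1.0.
        col = np.clip((xy[:, 0] * g).astype(int), 0, g - 1)
        row = np.clip((xy[:, 1] * g).astype(int), 0, g - 1)
        cell = row * g + col
        # One histogram per cell, flattened: bin index = cell * vocab_size + word.
        hist = np.bincount(cell * vocab_size + words,
                           minlength=g * g * vocab_size)
        parts.append(hist)
    h = np.concatenate(parts).astype(float)
    # L1-normalize; guard against an image with no keypoints.
    return h / max(h.sum(), 1.0)
```

Because the cell boundaries depend only on `levels` and never on image content, every category is described with the same partition. That is the rigidity the abstract refers to.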
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Elfiky, N. (2020). A Novel Spatial Layout Representation for Object Recognition. In: Hassanien, AE., Azar, A., Gaber, T., Oliva, D., Tolba, F. (eds) Proceedings of the International Conference on Artificial Intelligence and Computer Vision (AICV2020). AICV 2020. Advances in Intelligent Systems and Computing, vol 1153. Springer, Cham. https://doi.org/10.1007/978-3-030-44289-7_52
DOI: https://doi.org/10.1007/978-3-030-44289-7_52
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-44288-0
Online ISBN: 978-3-030-44289-7
eBook Packages: Intelligent Technologies and Robotics (R0)