Spatial Weighting for Bag-of-Visual-Words and Its Application in Content-Based Image Retrieval

Chen, Xin; Hu, Xiaohua; Shen, Xiajiong

doi:10.1007/978-3-642-01307-2_90


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5476)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining


Abstract

Retrieving images from a large and highly varied image data set based on their visual contents is a challenging and important task. Problems such as how to bridge the semantic gap between image features and the user have attracted a lot of attention from the research community. Recently, the 'bag of visual words' approach has exhibited very good performance in content-based image retrieval (CBIR). However, since the 'bag of visual words' approach represents an image as an unordered collection of local descriptors that use only intensity information, the resulting model provides little insight into the spatial constitution and color information of the image. In this paper, we develop a novel image representation method that uses a Gaussian mixture model (GMM) to provide spatial weighting for visual words, and we apply this method to facilitate content-based image retrieval. Our approach is simple and more efficient than the orderless 'bag of visual words' approach. In our method, we first extract visual tokens from the image data set and cluster them into a lexicon of visual words. Then, we represent the spatial constitution of an image as a mixture of n Gaussians in the feature space and decompose the image into n regions. The spatial weighting scheme is achieved by weighting visual words according to the probability of each visual word belonging to each of the n regions in the image. The cosine similarity between spatially weighted visual word vectors is used as the distance measure between regions, while the image-level distance is obtained by averaging the pairwise distances between regions. We compare the performance of our method with the traditional 'bag of visual words' and 'blobworld' approaches under the same image retrieval scenario. Experimental results demonstrate that our method is able to tell images apart at the semantic level and improves the performance of CBIR.
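
As a rough illustration of the weighting and distance steps described in the abstract, the following Python sketch builds one weighted visual-word vector per region and averages the pairwise region distances between two images. It is not the authors' implementation: the function names, the input layout, and the use of 1 - cosine similarity as the region distance are assumptions made here, and both the token-to-word assignments and the GMM membership probabilities are taken as already computed.

```python
import math


def region_word_vectors(word_ids, posteriors, vocab_size):
    """Build one spatially weighted visual-word vector per region.

    word_ids[t]   : index of the visual word assigned to token t (assumed given)
    posteriors[t] : probabilities that token t belongs to each of the n regions,
                    e.g. from a fitted Gaussian mixture model (assumed given)
    """
    n_regions = len(posteriors[0])
    vectors = [[0.0] * vocab_size for _ in range(n_regions)]
    for word, probs in zip(word_ids, posteriors):
        for region, p in enumerate(probs):
            # Each token contributes to every region in proportion to its
            # membership probability, rather than a hard count of 1.
            vectors[region][word] += p
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def image_distance(regions_a, regions_b):
    """Average the pairwise region distances between two images.

    The region distance is taken as 1 - cosine similarity (an assumption).
    """
    dists = [1.0 - cosine_similarity(u, v) for u in regions_a for v in regions_b]
    return sum(dists) / len(dists)


if __name__ == "__main__":
    # Toy data: 3 tokens, a vocabulary of 3 visual words, n = 2 regions.
    image_a = region_word_vectors(
        [0, 1, 2], [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]], vocab_size=3
    )
    image_b = region_word_vectors(
        [0, 0, 2], [[0.7, 0.3], [0.6, 0.4], [0.2, 0.8]], vocab_size=3
    )
    print(image_distance(image_a, image_b))
```

Because every region of one image is compared with every region of the other, the distance between an image and itself is generally not zero under this averaging; a retrieval system would only use the value to rank candidates against a query.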



Author information

Authors and Affiliations

College of Information Science and Technology, Drexel University, Philadelphia, PA, USA
Xin Chen & Xiaohua Hu
College of Computer and Information Engineering, Henan University, Henan, China
Xiaohua Hu & Xiajiong Shen


Editor information

Editors and Affiliations

Sirindhorn International Institute of Technology, Thammasat University, 131 Moo 5 Tiwanont Road, 12000, Bangkadi, Muang, Pathumthani, Thailand
Thanaruk Theeramunkong
Dept. of Computer Engineering, Faculty of Engineering, Chulalongkorn University, 10330, Bangkok, Thailand
Boonserm Kijsirikul
Faculty of Science & Engineering, York University, 355 Lumbers Building, 4700 Keele Street, M3J 1P3, Toronto, Ontario, Canada
Nick Cercone
School of Knowledge Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, 923-1292, Ishikawa, Japan
Tu-Bao Ho


About this paper

Cite this paper

Chen, X., Hu, X., Shen, X. (2009). Spatial Weighting for Bag-of-Visual-Words and Its Application in Content-Based Image Retrieval. In: Theeramunkong, T., Kijsirikul, B., Cercone, N., Ho, TB. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2009. Lecture Notes in Computer Science, vol 5476. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01307-2_90


DOI: https://doi.org/10.1007/978-3-642-01307-2_90
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01306-5
Online ISBN: 978-3-642-01307-2
eBook Packages: Computer Science, Computer Science (R0)
