Abstract
This article addresses the representation, indexing, and retrieval of images through the signature-based bag of visual words (S-BoVW) paradigm, which maps features extracted from image blocks into a set of words without requiring a clustering step. We propose the first S-BoVW method that uses texture information to generate textual signatures of image blocks. We also propose a strategy that represents image blocks with words generated from both color and texture information. The textual representation produced by this strategy allows traditional text retrieval and ranking techniques to be applied to compute the similarity between images. We performed experiments with distinct similarity functions and weighting schemes, comparing the proposed strategy to the well-known cluster-based bag of visual words (C-BoVW) approach and to previously proposed S-BoVW methods. Our results show that the proposed strategy is a competitive alternative for image retrieval and outperforms the baselines in many scenarios.
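To make the general idea concrete, the following is a minimal sketch of a signature-style pipeline: each image block is mapped directly to a textual word, with no clustering, and images are then compared with a standard text-retrieval measure. It is not the signature scheme proposed in the article. The color quantization (four levels per channel), the quadrant-based texture bits, the smoothed IDF formula, and all function names (`block_word`, `image_words`, `tfidf_cosine`) are assumptions chosen only for illustration.

```python
import math
from collections import Counter


def block_word(block):
    """Map one block (rows of (r, g, b) tuples, at least 2x2) to a word.

    Hypothetical signature: quantized mean color plus a 4-bit texture code.
    """
    pixels = [p for row in block for p in row]
    n = len(pixels)
    # Color part: mean of each channel quantized to 4 levels (0..3).
    color = "".join(str(sum(p[c] for p in pixels) // n // 64) for c in range(3))
    # Texture part: one bit per quadrant, set when the quadrant is
    # brighter than the whole block (a crude stand-in for a texture code).
    gray = [[sum(p) / 3 for p in row] for row in block]
    mean = sum(map(sum, gray)) / n
    h, w = len(gray), len(gray[0])
    bits = ""
    for r0, r1 in ((0, h // 2), (h // 2, h)):
        for c0, c1 in ((0, w // 2), (w // 2, w)):
            quad = [gray[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            bits += "1" if sum(quad) / len(quad) > mean else "0"
    return f"c{color}t{bits}"


def image_words(image, size=2):
    """Split an image into non-overlapping size x size blocks; count words."""
    h, w = len(image), len(image[0])
    words = []
    for r in range(0, h - size + 1, size):
        for c in range(0, w - size + 1, size):
            block = [row[c:c + size] for row in image[r:r + size]]
            words.append(block_word(block))
    return Counter(words)


def tfidf_cosine(query, doc, collection):
    """Cosine similarity between two bags of words under TF-IDF weighting."""
    n = len(collection)

    def weights(bag):
        out = {}
        for word, tf in bag.items():
            df = sum(1 for d in collection if word in d)
            # Smoothed IDF (an assumed variant) avoids division by zero.
            out[word] = tf * (math.log((1 + n) / (1 + df)) + 1)
        return out

    q, d = weights(query), weights(doc)
    dot = sum(v * d.get(word, 0.0) for word, v in q.items())
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in d.values())
    )
    return dot / norm if norm else 0.0


# Usage: two tiny synthetic 4x4 images, one uniformly red, one uniformly blue.
red = [[(255, 0, 0)] * 4 for _ in range(4)]
blue = [[(0, 0, 255)] * 4 for _ in range(4)]
bags = [image_words(red), image_words(blue)]
score_same = tfidf_cosine(bags[0], bags[0], bags)
score_diff = tfidf_cosine(bags[0], bags[1], bags)
```

Because every block becomes an ordinary token, the bags of words could equally be fed to an inverted index or to another ranking function; the cosine measure here is only one of the possible similarity functions.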
Acknowledgments
The authors thank CAPES, E-vox/FAPEAM, FAPESP (grants #2010/52113-5, #2013/50169-1, and #2013/50155-0), and CNPq (fellowship grants to Edleno S. de Moura, Altigran S. da Silva, and Ricardo Torres) for financial support.
Cite this article
dos Santos, J.M., de Moura, E.S., da Silva, A.S. et al. Color and texture applied to a signature-based bag of visual words method for image retrieval. Multimed Tools Appl 76, 16855–16872 (2017). https://doi.org/10.1007/s11042-016-3955-4
DOI: https://doi.org/10.1007/s11042-016-3955-4