Abstract
In recent years, the Bag-of-Words (BoW) model based on SIFT features has been one of the most widely adopted approaches in Content-Based Image Retrieval (CBIR) systems. However, these systems show weaknesses, especially on large-scale image collections, for two main reasons: first, information is lost in the quantization step; second, SIFT features describe only the local gradient. To tackle these issues, we propose, on the one hand, to take advantage of Hamming Embedding, soft assignment and multiple assignment techniques and, on the other, to fuse SIFT and color features at the indexing level in a multi-index structure. Specifically, this paper proposes generic and non-parametric image retrieval schemes as well as a novel multi-IDF design based on the multi-index structure.
Extensive experiments were conducted on three public datasets (Holidays, Ukbench, and MIR Flickr 1M as distractor). The experimental results are promising and outperform those of state-of-the-art CBIR systems. In addition, only 117 bits are needed to represent each keypoint, which makes our image retrieval scheme suitable for large-scale experiments.
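To make the idea of fusing SIFT and color features at the indexing level more concrete, the following is a minimal sketch of a two-dimensional inverted index combined with a Hamming-distance filter on binary signatures. It is not the authors' implementation: the names `MultiIndex`, `hamming` and `max_dist`, the plain vote counting (no IDF weighting), and the signature widths are all assumptions made for illustration, and the abstract does not specify how the 117 bits per keypoint are laid out.

```python
from collections import defaultdict


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary signatures."""
    return bin(a ^ b).count("1")


class MultiIndex:
    """Toy 2-D inverted index keyed by (SIFT word, color word).

    Hypothetical structure for illustration only.
    """

    def __init__(self):
        # Each cell stores (image_id, binary signature) postings.
        self.cells = defaultdict(list)

    def add(self, image_id, sift_word, color_word, signature):
        """Index one keypoint under its pair of visual words."""
        self.cells[(sift_word, color_word)].append((image_id, signature))

    def query(self, keypoints, max_dist):
        """Vote for database images that share a cell with a query keypoint.

        keypoints: iterable of (sift_words, color_words, signature), where
        each word list may hold several nearest words (multiple assignment).
        A posting counts only if its signature lies within max_dist bits.
        """
        votes = defaultdict(int)
        for sift_words, color_words, sig in keypoints:
            for sw in sift_words:
                for cw in color_words:
                    for image_id, db_sig in self.cells.get((sw, cw), ()):
                        if hamming(sig, db_sig) <= max_dist:
                            votes[image_id] += 1
        # Highest vote count first; ties broken by image id.
        return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))


index = MultiIndex()
index.add("a", 3, 1, 0b1011)
index.add("b", 3, 2, 0b1011)
index.add("c", 3, 1, 0b0100)

# Single assignment: only cell (3, 1) is visited.
single = index.query([([3], [1], 0b1010)], max_dist=1)
# Multiple assignment on the color side: cells (3, 1) and (3, 2) are visited.
multiple = index.query([([3], [1, 2], 0b1010)], max_dist=1)
```

In this toy setup, a match requires agreement on both the SIFT word and the color word, and the Hamming test then rejects postings whose signatures are too far apart. Assigning a query keypoint to several words widens the set of cells visited, which is one way to offset quantization loss.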
Cite this article
Elleuch, Z., Marzouki, K. Multi-index structure based on SIFT and color features for large scale image retrieval. Multimed Tools Appl 76, 13929–13951 (2017). https://doi.org/10.1007/s11042-016-3788-1
DOI: https://doi.org/10.1007/s11042-016-3788-1