An effective fusion model for image retrieval

Multimedia Tools and Applications

Abstract

In the past decade, the popular Bag of Visual Words approach has been applied to many computer vision tasks, including image classification, video search, robot localization, and texture recognition. Unfortunately, most approaches use intensity features and discard color information, an important image characteristic whose use is motivated by human vision. Moreover, when background colors dominate foreground ones, the Dominant Color Descriptor (DCD) matches images on their background colors, so it retrieves images with similar backgrounds rather than similar foreground objects. On the other hand, color features alone are not sufficient for similar objects with different colors (e.g. a white dog vs. a black dog). To solve these problems, a new Salient DCD (SDCD) color descriptor is proposed that extracts foreground color and adds semantic information to DCD, based on color distances and salient object extraction methods. In addition, a new fusion model is presented to fuse the SDCD histogram with the PHOW MSDSIFT histogram. Performance evaluation on several datasets shows that the new approach outperforms existing state-of-the-art methods.
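
The abstract does not give the SDCD construction or the fusion rule, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a boolean foreground mask is already available from some salient object detector, uses a coarse RGB histogram over the masked pixels as a stand-in for SDCD, and fuses it with a shape histogram (standing in for the PHOW MSDSIFT histogram) by weighted concatenation. The function names, `bins_per_channel`, and `color_weight` are hypothetical.

```python
import numpy as np

def foreground_color_histogram(image, mask, bins_per_channel=4):
    """Coarse RGB histogram over pixels where mask is True (L1-normalized).

    Assumption: this simple quantized histogram stands in for a
    dominant-color descriptor restricted to the salient region.
    """
    n_bins = bins_per_channel ** 3
    pixels = image[mask]                      # (n, 3) foreground pixels
    if pixels.size == 0:
        return np.zeros(n_bins)
    # Map each 0..255 channel value to a bin index 0..bins_per_channel-1.
    q = (pixels.astype(np.int64) * bins_per_channel) // 256
    idx = (q[:, 0] * bins_per_channel + q[:, 1]) * bins_per_channel + q[:, 2]
    hist = np.bincount(idx, minlength=n_bins).astype(float)
    return hist / hist.sum()

def fuse_histograms(color_hist, shape_hist, color_weight=0.5):
    """Weighted concatenation of two L1-normalized histograms.

    Assumption: concatenation is one common fusion choice; the paper's
    actual fusion model may differ.
    """
    c = np.asarray(color_hist, dtype=float)
    s = np.asarray(shape_hist, dtype=float)
    c = c / c.sum() if c.sum() > 0 else c
    s = s / s.sum() if s.sum() > 0 else s
    return np.concatenate([color_weight * c, (1.0 - color_weight) * s])

# Toy example: a red background with a 2x2 white foreground object.
image = np.zeros((4, 4, 3), dtype=np.uint8)
image[..., 0] = 255                           # red everywhere
image[1:3, 1:3] = 255                         # white square in the center
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True                         # hypothetical saliency mask

color_hist = foreground_color_histogram(image, mask)
fused = fuse_histograms(color_hist, shape_hist=[2.0, 2.0])
```

Because the histogram is computed only over masked pixels, the red background contributes nothing to the color part of the fused vector, which is the behavior the abstract motivates for images whose background would otherwise dominate.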

Author information

Corresponding author

Correspondence to Leila Mansourian.

Additional information

This article was kindly supported by the Malaysian Ministry of Higher Education under the Fundamental Research Grant Scheme (FRGS).

About this article

Cite this article

Mansourian, L., Abdullah, M.T., Abdullah, L.N. et al. An effective fusion model for image retrieval. Multimed Tools Appl 77, 16131–16154 (2018). https://doi.org/10.1007/s11042-017-5192-x

  • DOI: https://doi.org/10.1007/s11042-017-5192-x
