Abstract
Over the past decade, the popular Bag of Visual Words approach has been applied to many computer vision tasks, including image classification, video search, robot localization, and texture recognition. Unfortunately, most approaches use intensity features and discard color information, an important characteristic of any image whose use is motivated by human vision. Moreover, when background colors dominate foreground colors, the Dominant Color Descriptor (DCD) tends to retrieve images with similar background colors rather than similar foreground objects. Conversely, color features alone are not sufficient for similar objects with different colors (e.g., a white dog vs. a black dog). To solve these problems, a new Salient DCD (SDCD) color descriptor is proposed that extracts foreground color and adds semantic information to DCD, based on color distances and salient object extraction methods. In addition, a new fusion model is presented to fuse the SDCD histogram with the PHOW MSDSIFT histogram. Performance evaluation on several datasets shows that the new approach outperforms existing state-of-the-art methods.
Additional information
This article was kindly supported by the Malaysian Ministry of Higher Education under the Fundamental Research Grant Scheme (FRGS).
Cite this article
Mansourian, L., Abdullah, M.T., Abdullah, L.N. et al. An effective fusion model for image retrieval. Multimed Tools Appl 77, 16131–16154 (2018). https://doi.org/10.1007/s11042-017-5192-x
DOI: https://doi.org/10.1007/s11042-017-5192-x