Constructing a discriminative visual vocabulary with macro and micro sense of visual words

Kuo, Chung-Ming; Hsieh, Chaur-Heh; Yang, Nai-Chung; Kuo, Chang-Ming; Chang, Chi-Kao; Chen, Yu-Ming

doi:10.1007/s11042-015-2970-1

Published: 22 October 2015

Volume 75, pages 16983–17017, (2016)

Multimedia Tools and Applications

Chung-Ming Kuo¹,
Chaur-Heh Hsieh²,
Nai-Chung Yang¹,
Chang-Ming Kuo¹,
Chi-Kao Chang¹ &
…
Yu-Ming Chen¹

Abstract

The visual vocabulary representation approach has been successfully applied to many multimedia and vision applications, including visual recognition, image retrieval, and scene modeling/categorization. The idea behind the visual vocabulary representation is that an image can be represented by visual words, that is, a collection of local image features. In this work, we develop a new scheme for constructing a visual vocabulary based on an analysis of visual word contents. By considering the content homogeneity of visual words, we design a visual vocabulary that contains macro-sense and micro-sense visual words. The two types of visual words are then combined to describe an image effectively. We also apply the visual vocabulary to construct image retrieval and categorization systems. The performance evaluation of the two systems indicates that the proposed visual vocabulary achieves promising results.
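As a rough illustration of the general idea described in the abstract, the following sketch shows a generic bag-of-visual-words pipeline: local descriptors are clustered into a vocabulary, and an image is then described by a histogram of its nearest visual words. This is not the macro-sense/micro-sense construction proposed in the paper, whose details are not given here. The function names (`build_vocabulary`, `bow_histogram`), the plain k-means clustering, and the toy 2-D descriptors are all assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vocab, d):
    """Index of the visual word (cluster center) closest to descriptor d."""
    return min(range(len(vocab)), key=lambda k: sq_dist(vocab[k], d))


def build_vocabulary(descriptors, k, iters=20, seed=0):
    """Cluster local descriptors into k visual words with plain k-means.

    Hypothetical helper: a generic baseline, not the paper's method.
    """
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(descriptors, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest(centers, d)].append(d)
        for j, g in enumerate(groups):
            if g:  # keep the previous center if a cluster is empty
                centers[j] = [sum(col) / len(g) for col in zip(*g)]
    return centers


def bow_histogram(vocab, image_descriptors):
    """Represent one image as a normalized histogram over visual words."""
    hist = [0.0] * len(vocab)
    for d in image_descriptors:
        hist[nearest(vocab, d)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


if __name__ == "__main__":
    # Toy descriptors standing in for real local features.
    training = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
                (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
    vocab = build_vocabulary(training, k=2)
    print(bow_histogram(vocab, [(0.0, 0.05), (5.0, 5.05), (5.05, 5.0)]))
```

In practice the descriptors would come from a local feature extractor, and two images could be compared by a distance between their histograms. How the paper splits or merges words by content homogeneity is not specified in the abstract, so it is deliberately left out of this sketch.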

Acknowledgments

The authors would like to express their sincere thanks to the anonymous reviewers for their invaluable comments and suggestions. This work was supported by the National Science Council of R.O.C. under grant NSC 102-2221-E-214-040.

Author information

Authors and Affiliations

Department of Information Engineering, I-Shou University, Dashu, Kaohsiung, Taiwan
Chung-Ming Kuo, Nai-Chung Yang, Chang-Ming Kuo, Chi-Kao Chang & Yu-Ming Chen
Department of Computer and Communication Engineering, Ming Chuan University, Gui-Shan, Taoyuan, Taiwan
Chaur-Heh Hsieh

Corresponding author

Correspondence to Chung-Ming Kuo.

About this article

Cite this article

Kuo, CM., Hsieh, CH., Yang, NC. et al. Constructing a discriminative visual vocabulary with macro and micro sense of visual words. Multimed Tools Appl 75, 16983–17017 (2016). https://doi.org/10.1007/s11042-015-2970-1

Received: 08 December 2014
Revised: 18 August 2015
Accepted: 25 September 2015
Published: 22 October 2015
Issue Date: December 2016
DOI: https://doi.org/10.1007/s11042-015-2970-1
