Abstract
Image annotation systems automatically annotate images with semantic keywords, and machine learning approaches are often used to build them. In this paper, we propose an image annotation approach that incorporates word correlations into a multi-class support vector machine (SVM). First, each image is divided into five fixed-size blocks, which avoids time-consuming object segmentation. Every keyword of a training image is manually assigned to the corresponding block, and word correlations are computed from a co-occurrence matrix. Then, MPEG-7 visual descriptors are extracted from these blocks to represent visual features, and the minimal-redundancy-maximal-relevance (mRMR) method is used to reduce the feature dimension. A block-feature-based multi-class SVM classifier is trained for 80 semantic concepts. Finally, the probabilistic outputs of the SVM and the word correlations are integrated to obtain the final annotation keywords. Experiments on the Corel 5000 dataset demonstrate that our approach is effective and efficient.
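The final integration step can be illustrated with a minimal Python sketch. The abstract does not state the fusion formula, so everything specific here is an assumption made for illustration: correlations are estimated as conditional co-occurrence frequencies, block-level probabilities are pooled to the image level by taking the maximum, and the two sources are blended linearly with a hypothetical weight `alpha`. The function names `cooccurrence_correlations` and `rerank` are likewise hypothetical, and the toy data are not from the Corel experiments.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_correlations(annotations):
    """Estimate corr[a][b] = P(b | a) from keyword co-occurrence counts.

    `annotations` is a list of keyword lists, one per training image.
    (Assumed estimator; the abstract only says a co-occurrence matrix is used.)
    """
    count = defaultdict(int)
    pair = defaultdict(lambda: defaultdict(int))
    for words in annotations:
        unique = set(words)
        for w in unique:
            count[w] += 1
        for a, b in combinations(sorted(unique), 2):
            pair[a][b] += 1
            pair[b][a] += 1
    return {a: {b: pair[a][b] / count[a] for b in pair[a]} for a in count}


def rerank(block_probs, corr, alpha=0.5, top_k=3):
    """Blend per-block classifier probabilities with word correlations.

    `block_probs` holds one dict (keyword -> probability) per block.
    `alpha` is a hypothetical blending weight: 0 ignores correlations.
    """
    # Image-level score: the best block probability for each keyword.
    base = defaultdict(float)
    for probs in block_probs:
        for w, p in probs.items():
            base[w] = max(base[w], p)

    scores = {}
    for w in base:
        others = [v for v in base if v != w]
        # Support for w from the other candidates, weighted by their scores.
        support = sum(base[v] * corr.get(v, {}).get(w, 0.0) for v in others)
        norm = sum(base[v] for v in others) or 1.0
        scores[w] = (1 - alpha) * base[w] + alpha * support / norm
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


# Toy data, invented for this example.
training_annotations = [
    ["sky", "plane"],
    ["sky", "plane"],
    ["sky", "grass"],
    ["water", "boat"],
]
corr = cooccurrence_correlations(training_annotations)

# Probabilistic outputs for two blocks of one test image.
block_probs = [
    {"sky": 0.7, "plane": 0.2, "boat": 0.1},
    {"sky": 0.25, "plane": 0.35, "boat": 0.4},
]

print(rerank(block_probs, corr, alpha=0.0, top_k=2))  # classifier only
print(rerank(block_probs, corr, alpha=0.5, top_k=2))  # with correlations
```

In this toy case the classifier alone ranks "boat" above "plane", while the blended score promotes "plane" because it co-occurs with the confidently predicted "sky" in the training annotations. This is the kind of correction word correlations are meant to provide.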
Notes
More instances of blocks are shown in Fig. 9. For disambiguation, in the following we use “block” to denote a tile or sub-image and “image” to denote the full-size picture.
Sequence numbers 1–4 in Fig. 3.
Sequence number 5 in Fig. 3.
Corel data are distributed through http://www.emsps.com/photocd/corelcds.htm.
Acknowledgments
We gratefully thank the anonymous reviewers for their constructive comments. This work is supported by the Natural Science Foundation of China (60970047) and by the Natural Science Foundation and the Key Science-Technology Project of Shandong Province (Y2008 G19, 2007GG100-01002, 2008GG10001026).
Cite this article
Zhang, L., Ma, J. Image annotation by incorporating word correlations into multi-class SVM. Soft Comput 15, 917–927 (2011). https://doi.org/10.1007/s00500-010-0558-2
DOI: https://doi.org/10.1007/s00500-010-0558-2