Abstract
Acquiring training data is a problem in large-scale web image annotation based on statistical learning. A common approach is to build a large training set by analyzing web content automatically. However, such an approach unavoidably introduces noisy data. In this paper, we present a novel web image annotation method that learns from a noisy training set using Mixture Component based Local Fisher Discriminant Analysis (MLFDA). In our method, image annotation is viewed as a multi-class classification problem. To alleviate the influence of noisy data, the separating hyperplanes between classes are learned by kernel-based local Fisher discriminant analysis. The mixture components of each class are then estimated in the resulting subspace, where noisy modes receive small weights and play a less important role in classification. Experimental results on a real-world web data set of 4000 images show that our method outperforms MBRM [3] and an SVM-based method, improving the F1 measure by 83% and 18%, respectively.
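The role of the mixture weights can be illustrated with a toy sketch. This is not the authors' implementation: it assumes the features have already been projected to a one-dimensional subspace, and the class labels, weights, means, and standard deviations are hypothetical values chosen for illustration. Each class is modeled as a Gaussian mixture in which a small-weight component stands in for mislabeled (noisy) training images.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def class_likelihood(x, components):
    """Mixture likelihood: sum over components of weight * density."""
    return sum(w * gaussian_pdf(x, m, s) for (w, m, s) in components)

def classify(x, mixtures):
    """Return the label whose mixture assigns x the highest likelihood."""
    return max(mixtures, key=lambda label: class_likelihood(x, mixtures[label]))

# Hypothetical mixtures as (weight, mean, std) triples in the projected space.
# The 0.1-weight component of each class models noisy samples that actually
# look like the other class.
MIXTURES = {
    "tiger": [(0.9, -2.0, 1.0), (0.1, 2.0, 1.0)],
    "car": [(0.9, 2.0, 1.0), (0.1, -2.0, 1.0)],
}

label_near_car_mode = classify(1.8, MIXTURES)
label_near_tiger_mode = classify(-1.8, MIXTURES)
```

Because the noisy component carries only a small weight, a point near 2.0 is assigned to "car" even though the "tiger" mixture also has a component centered there. With equal weights the two classes would be indistinguishable at that point.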
References
Feng, S.L., Manmatha, R., Lavrenko, V.: Multiple Bernoulli Relevance Models for Image and Video Annotation. In: Proc. CVPR 2004, pp. 1002–1009 (2004)
Barnard, K., Forsyth, D.: Learning the semantics of words and pictures. In: Proc. ICCV, pp. 408–415 (2001)
Gao, Y.L., Fan, J.P., Xue, X.Y., Jain, R.: Automatic Image Annotation by Incorporating Feature Hierarchy and Boosting to Scale up SVM Classifiers. In: Proc. ACM Multimedia, pp. 901–910 (2006)
Baudat, G., Anouar, F.: Generalized Discriminant Analysis Using a Kernel Approach. Neural Computation 12(10), 2385–2404 (2000)
HTML Parser, http://htmlparser.sourceforge.net
Wang, X.J., Zhang, L., Jing, F., Ma, W.Y.: AnnoSearch: Image Auto-Annotation by Search. In: Proc. CVPR, pp. 1483–1490 (2006)
Hua, Z.G., Wang, X.J., Liu, Q.S., Lu, H.Q.: Semantic Knowledge Extraction and Annotation for Web Images. In: Proc. ACM Multimedia, pp. 467–470 (2005)
Li, J., Wang, J.Z.: Real-Time Computerized Annotation of Pictures. In: Proc. ACM Multimedia, pp. 911–920 (2006)
Sugiyama, M.: Local Fisher Discriminant Analysis for Supervised Dimensionality Reduction. In: Proc. ICML, pp. 905–912 (2006)
Lawrence, N.D., Schölkopf, B.: Estimating a Kernel Fisher Discriminant in the Presence of Label Noise. In: Proc. ICML, pp. 306–313 (2001)
Zhu, X.Q., Wu, X.D.: Class Noise vs. Attribute Noise: A Quantitative Study of Their Impacts. Artificial Intelligence Review 22, 177–210 (2004)
Carneiro, G., Vasconcelos, N.: Formulating Semantic Image Annotation as a Supervised Learning Problem. In: Proc. CVPR, pp. 163–168 (2005)
Cai, D., Yu, S., Wen, J., et al.: VIPS: A Vision-Based Page Segmentation Algorithm. Microsoft Technical Report, MSR-TR-2003-79 (2003)
Cai, D., He, X.F., Li, Z.W., Ma, W.Y., Wen, J.R.: Hierarchical Clustering of WWW Image Search Results Using Visual, Textual and Link Information. In: Proc. ACM Multimedia, pp. 952–959 (2004)
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, M., Zhou, X., Xu, H. (2008). Web Image Annotation Based on Automatically Obtained Noisy Training Set. In: Zhang, Y., Yu, G., Bertino, E., Xu, G. (eds) Progress in WWW Research and Development. APWeb 2008. Lecture Notes in Computer Science, vol 4976. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78849-2_64
DOI: https://doi.org/10.1007/978-3-540-78849-2_64
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78848-5
Online ISBN: 978-3-540-78849-2
eBook Packages: Computer Science (R0)