Web Image Annotation Based on Automatically Obtained Noisy Training Set

Wang, Mei; Zhou, Xiangdong; Xu, Hongtao

doi:10.1007/978-3-540-78849-2_64

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4976)

Included in the following conference series:

Asia-Pacific Web Conference

Abstract

Training data acquisition is a problem in large-scale, statistical-learning-based web image annotation. A common idea is to build a large training set by analyzing web content automatically. However, noisy data is unavoidably involved in this kind of approach. In this paper, we present a novel web image annotation method based on a noisy training set, using Mixture Component based Local Fisher Discriminant Analysis (MLFDA). In our method, image annotation is viewed as a multi-class classification problem. To alleviate the influence of the noisy data, the separating hyperplanes between different classes are learned by kernel-based local Fisher discriminant analysis. Then the mixture components for each class are estimated in the subspace, where the noisy modals will gain small weights and play a less important role in classification. The experimental results on a real-world web data set of 4000 images show that our method outperforms MBRM [3] and an SVM-based method, improving the F₁ measure by 83% and 18%, respectively.
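
The weighting idea in the last step can be illustrated with a small sketch. This is not the authors' MLFDA implementation, which is not available from the abstract. It assumes the discriminant projection has already mapped each image to a single number, and it uses hand-picked (hypothetical) mixture weights, means, and variances instead of estimating them from data. The labels "tiger" and "beach" and the names `gaussian_pdf`, `class_likelihood`, `classify`, and `MIXTURES` are all invented for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def class_likelihood(x, components):
    """Mixture likelihood: sum of weight * density over (weight, mean, var) triples."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in components)


def classify(x, mixtures):
    """Return the label whose mixture assigns x the highest likelihood."""
    return max(mixtures, key=lambda label: class_likelihood(x, mixtures[label]))


# Hypothetical projected features (assumption: one dimension after projection).
# Each class has one dominant component and one low-weight component that
# stands in for mislabeled (noisy) training samples lying near the other class.
MIXTURES = {
    "tiger": [(0.9, -2.0, 1.0), (0.1, 2.0, 1.0)],
    "beach": [(0.9, 2.0, 1.0), (0.1, -2.0, 1.0)],
}

if __name__ == "__main__":
    for x in (-1.5, 1.5):
        print(x, classify(x, MIXTURES))
```

Because the noisy component of each class carries only a small weight, a sample near 2.0 is still assigned to "beach" even though the "tiger" mixture has a component centered there. In a real system the weights would be estimated (for example by expectation-maximization) rather than fixed by hand.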

References

Feng, S.L., Manmatha, R., Lavrenko, V.: Multiple Bernoulli Relevance Models for Image and Video Annotation. In: Proc. CVPR 2004, pp. 1002–1009 (2004)
Barnard, K., Forsyth, D.: Learning the semantics of words and pictures. In: Proc. ICCV, pp. 408–415 (2001)
Gao, Y.L., Fan, J.P., Xue, X.Y., Jain, R.: Automatic Image Annotation by Incorporating Feature Hierarchy and Boosting to Scale up SVM Classifiers. In: Proc. ACM Multimedia, pp. 901–910 (2006)
Baudat, G., Anouar, F.: Generalized Discriminant Analysis Using a Kernel Approach. Neural Computation 12(10), 2385–2404 (2000)
HTML Parser, http://htmlparser.sourceforge.net
Wang, X.J., Zhang, L., Jing, F., Ma, W.Y.: AnnoSearch: Image Auto-Annotation by Search. In: Proc. CVPR, pp. 1483–1490 (2006)
Hua, Z.G., Wang, X.J., Liu, Q.S., Lu, H.Q.: Semantic Knowledge Extraction and Annotation for Web Images. In: Proc. ACM Multimedia, pp. 467–470 (2005)
Li, J., Wang, J.Z.: Real-Time Computerized Annotation of Picture. In: Proc. ACM Multimedia, pp. 911–920 (2006)
Sugiyama, M.: Local Fisher Discriminant Analysis for Supervised Dimensionality Reduction. In: Proc. ICML, pp. 905–912 (2006)
Lawrence, N.D., Schölkopf, B.: Estimating a Kernel Fisher Discriminant in the Presence of Label Noise. In: Proc. ICML, pp. 306–313 (2001)
Zhu, X.Q., Wu, X.D.: Class Noise vs. Attribute Noise: A Quantitative Study of Their Impacts. Artificial Intelligence Review 22, 177–210 (2004)
Carneiro, G., Vasconcelos, N.: Formulating Semantic Image Annotation as a Supervised Learning Problem. In: Proc. CVPR, pp. 163–168 (2005)
Cai, D., Yu, S., Wen, J.R., et al.: VIPS: A Vision-Based Page Segmentation Algorithm. Microsoft Technical Report, MSR-TR-2003-79 (2003)
Cai, D., He, X.F., Li, Z.W., Ma, W.Y., Wen, J.R.: Hierarchical Clustering of WWW Image Search Results Using Visual, Textual and Link Information. In: Proc. ACM Multimedia, pp. 952–959 (2004)

Author information

Authors and Affiliations

Department of Computing and Information Technology, Fudan University, China, 200433
Mei Wang, Xiangdong Zhou & Hongtao Xu

Editor information

Yanchun Zhang, Ge Yu, Elisa Bertino, Guandong Xu

About this paper

Cite this paper

Wang, M., Zhou, X., Xu, H. (2008). Web Image Annotation Based on Automatically Obtained Noisy Training Set. In: Zhang, Y., Yu, G., Bertino, E., Xu, G. (eds) Progress in WWW Research and Development. APWeb 2008. Lecture Notes in Computer Science, vol 4976. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78849-2_64

DOI: https://doi.org/10.1007/978-3-540-78849-2_64
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78848-5
Online ISBN: 978-3-540-78849-2
eBook Packages: Computer Science, Computer Science (R0)