Web Image Annotation Based on Automatically Obtained Noisy Training Set

  • Conference paper
Progress in WWW Research and Development (APWeb 2008)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 4976))

Abstract

Training data acquisition is a problem in large-scale web image annotation based on statistical learning. A common idea is to build a large training set by analyzing web content automatically. However, noisy data is unavoidable in this kind of approach. In this paper, we present a novel web image annotation method for noisy training sets using Mixture Component based Local Fisher Discriminant Analysis (MLFDA). In our method, image annotation is viewed as a multi-class classification problem. To alleviate the influence of the noisy data, the separating hyperplanes between different classes are learned by kernel-based local Fisher discriminant analysis. The mixture components for each class are then estimated in the resulting subspace, where the noisy modes receive small weights and play a less important role in classification. Experimental results on a real-world web data set of 4000 images show that our method outperforms MBRM [3] and an SVM-based method, improving the F1 measure by 83% and 18%, respectively.
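
The role of the mixture weights can be illustrated with a toy example. The sketch below is not the MLFDA implementation: it assumes a one-dimensional subspace, Gaussian components, equal class priors, and hand-picked weights (the paper estimates the components from data). The class labels, parameter values, and function names are all hypothetical. It only shows how a low-weight component, standing in for mislabeled training images, contributes little to the class likelihood.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def class_likelihood(x, components):
    """Mixture density: sum of weight * N(x; mean, std) over the components."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in components)


def classify(x, class_mixtures):
    """Return the label whose mixture gives x the highest density.

    Assumes equal class priors, so the likelihood alone decides.
    """
    return max(class_mixtures,
               key=lambda label: class_likelihood(x, class_mixtures[label]))


# Hypothetical mixtures in a 1-D projected subspace: (weight, mean, std).
# Each class has one dominant mode and one low-weight mode that stands in
# for noisy (mislabeled) training images lying near the other class.
CLASS_MIXTURES = {
    "tiger": [(0.9, -2.0, 1.0), (0.1, 2.0, 1.0)],
    "beach": [(0.9, 2.0, 1.0), (0.1, -2.0, 1.0)],
}

if __name__ == "__main__":
    for x in (-1.5, 1.5):
        print(x, classify(x, CLASS_MIXTURES))
```

In this toy setting a point near -2 is assigned to "tiger" even though the "beach" mixture also has a mode there, because that mode carries only 10% of the weight.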


References

  1. Feng, S.L., Manmatha, R., Lavrenko, V.: Multiple Bernoulli Relevance Models for Image and Video Annotation. In: Proc. CVPR, pp. 1002–1009 (2004)

  2. Barnard, K., Forsyth, D.: Learning the Semantics of Words and Pictures. In: Proc. ICCV, pp. 408–415 (2001)

  3. Gao, Y.L., Fan, J.P., Xue, X.Y., Jain, R.: Automatic Image Annotation by Incorporating Feature Hierarchy and Boosting to Scale up SVM Classifiers. In: Proc. ACM Multimedia, pp. 901–910 (2006)

  4. Baudat, G., Anouar, F.: Generalized Discriminant Analysis Using a Kernel Approach. Neural Computation 12(10), 2385–2404 (2000)

  5. HTML Parser, http://htmlparser.sourceforge.net

  6. Wang, X.J., Zhang, L., Jing, F., Ma, W.Y.: AnnoSearch: Image Auto-Annotation by Search. In: Proc. CVPR, pp. 1483–1490 (2006)

  7. Hua, Z.G., Wang, X.J., Liu, Q.S., Lu, H.Q.: Semantic Knowledge Extraction and Annotation for Web Images. In: Proc. ACM Multimedia, pp. 467–470 (2005)

  8. Li, J., Wang, J.Z.: Real-Time Computerized Annotation of Pictures. In: Proc. ACM Multimedia, pp. 911–920 (2006)

  9. Sugiyama, M.: Local Fisher Discriminant Analysis for Supervised Dimensionality Reduction. In: Proc. ICML, pp. 905–912 (2006)

  10. Lawrence, N.D., Schölkopf, B.: Estimating a Kernel Fisher Discriminant in the Presence of Label Noise. In: Proc. ICML, pp. 306–313 (2001)

  11. Zhu, X.Q., Wu, X.D.: Class Noise vs. Attribute Noise: A Quantitative Study of Their Impacts. Artificial Intelligence Review 22, 177–210 (2004)

  12. Carneiro, G., Vasconcelos, N.: Formulating Semantic Image Annotation as a Supervised Learning Problem. In: Proc. CVPR, pp. 163–168 (2005)

  13. Cai, D., Yu, S., Wen, J., et al.: VIPS: A Vision-Based Page Segmentation Algorithm. Microsoft Technical Report, MSR-TR-2003-79 (2003)

  14. Cai, D., He, X.F., Li, Z.W., Ma, W.Y., Wen, J.R.: Hierarchical Clustering of WWW Image Search Results Using Visual, Textual and Link Information. In: Proc. ACM Multimedia, pp. 952–959 (2004)

Editor information

Yanchun Zhang, Ge Yu, Elisa Bertino, Guandong Xu

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, M., Zhou, X., Xu, H. (2008). Web Image Annotation Based on Automatically Obtained Noisy Training Set. In: Zhang, Y., Yu, G., Bertino, E., Xu, G. (eds) Progress in WWW Research and Development. APWeb 2008. Lecture Notes in Computer Science, vol 4976. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78849-2_64

  • DOI: https://doi.org/10.1007/978-3-540-78849-2_64

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78848-5

  • Online ISBN: 978-3-540-78849-2

  • eBook Packages: Computer Science, Computer Science (R0)
