Multi-modal Multi-label Semantic Indexing of Images Based on Hybrid Ensemble Learning

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4810)

Abstract

Automatic image annotation (AIA), the association of words with whole images, is considered a promising and effective approach to bridging the semantic gap between low-level visual features and high-level semantic concepts. In this paper, we formulate image annotation as a multi-label, multi-class semantic image classification problem and propose a simple yet effective method: a hybrid ensemble learning framework in which a multi-label classifier based on uni-modal features and an ensemble classifier based on bi-modal features are integrated into a joint classification model to perform multi-modal multi-label semantic image annotation. We conducted experiments on two commonly used keyframe and image collections, the MediaMill and Scene datasets, comprising about 40,000 examples. The empirical studies demonstrate that the proposed hybrid ensemble learning method can enhance a given weak multi-label classifier to some extent, showing its effectiveness when only a limited amount of multi-labeled training data is available.
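
As a rough illustration of how the outputs of a uni-modal multi-label classifier and a bi-modal ensemble classifier might be combined into one joint decision, the following Python sketch averages per-label scores and thresholds the result. The abstract does not specify the integration rule, so the weighted average, the 0.5 threshold, the function names `fuse_label_scores` and `annotate`, and the example labels and scores are all assumptions made for illustration, not the authors' method.

```python
from typing import Dict, List, Optional, Sequence


def fuse_label_scores(
    score_sets: Sequence[Dict[str, float]],
    weights: Optional[Sequence[float]] = None,
) -> Dict[str, float]:
    """Weighted average of per-label scores from several classifiers.

    Hypothetical fusion rule: a label missing from one classifier's
    output is treated as having score 0.0 for that classifier.
    """
    if not score_sets:
        raise ValueError("at least one score set is required")
    if weights is None:
        weights = [1.0] * len(score_sets)
    if len(weights) != len(score_sets):
        raise ValueError("weights and score_sets must have the same length")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    # Collect every label that any classifier produced a score for.
    labels = set().union(*score_sets)
    return {
        label: sum(w * s.get(label, 0.0) for w, s in zip(weights, score_sets)) / total
        for label in labels
    }


def annotate(fused: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Return every label whose fused score reaches the threshold.

    Multi-label output: zero, one, or several labels may be returned.
    """
    return sorted(label for label, score in fused.items() if score >= threshold)


# Illustrative, made-up scores for one image.
uni_modal_scores = {"beach": 0.8, "sunset": 0.4, "urban": 0.1}
bi_modal_scores = {"beach": 0.6, "sunset": 0.7, "urban": 0.2}

fused_scores = fuse_label_scores([uni_modal_scores, bi_modal_scores])
predicted_labels = annotate(fused_scores)
print(predicted_labels)
```

In this toy case the uni-modal classifier alone would miss "sunset" at a 0.5 threshold, while the fused score recovers it. Equal weights are only one possible choice; weights could instead be tuned on held-out data.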

Editor information

Horace H.-S. Ip, Oscar C. Au, Howard Leung, Ming-Ting Sun, Wei-Ying Ma, Shi-Min Hu

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Li, W., Sun, M., Habel, C. (2007). Multi-modal Multi-label Semantic Indexing of Images Based on Hybrid Ensemble Learning. In: Ip, H.HS., Au, O.C., Leung, H., Sun, MT., Ma, WY., Hu, SM. (eds) Advances in Multimedia Information Processing – PCM 2007. PCM 2007. Lecture Notes in Computer Science, vol 4810. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77255-2_90

  • DOI: https://doi.org/10.1007/978-3-540-77255-2_90

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-77254-5

  • Online ISBN: 978-3-540-77255-2

  • eBook Packages: Computer Science, Computer Science (R0)
