Multi-modal Multi-label Semantic Indexing of Images Based on Hybrid Ensemble Learning

Li, Wei; Sun, Maosong; Habel, Christopher

doi:10.1007/978-3-540-77255-2_90

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4810)

Abstract

Automatic image annotation (AIA), the association of words with whole images, is considered a promising and effective approach to bridging the semantic gap between low-level visual features and high-level semantic concepts. In this paper, we formulate image annotation as a multi-label, multi-class semantic image classification problem and propose a simple yet effective method: a hybrid ensemble learning framework in which a multi-label classifier based on uni-modal features and an ensemble classifier based on bi-modal features are integrated into a joint classification model to perform multi-modal, multi-label semantic image annotation. We conducted experiments on two commonly used keyframe and image collections, the MediaMill and Scene datasets, comprising about 40,000 examples. The empirical studies demonstrate that the proposed hybrid ensemble learning method can enhance a given weak multi-label classifier to some extent, showing its effectiveness when only a limited amount of multi-labeled training data is available.
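The abstract does not say how the two classifiers are joined, so the following is only a minimal sketch of one plausible reading: per-label confidences from a uni-modal multi-label classifier and from an ensemble over bi-modal features are fused by a weighted average and then thresholded. The names `ensemble_scores` and `hybrid_annotate`, the `weight` and `threshold` parameters, the averaging rule, and the toy scorers are all illustrative assumptions, not the authors' method.

```python
from typing import Callable, Dict, List, Sequence

Scores = Dict[str, float]                      # label -> confidence in [0, 1]
Scorer = Callable[[Sequence[float]], Scores]   # feature vector -> per-label scores


def ensemble_scores(members: List[Scorer], features: Sequence[float]) -> Scores:
    """Average the per-label confidences of several classifiers (assumed rule)."""
    totals: Scores = {}
    for member in members:
        for label, score in member(features).items():
            totals[label] = totals.get(label, 0.0) + score
    return {label: total / len(members) for label, total in totals.items()}


def hybrid_annotate(
    uni_scorer: Scorer,
    bi_members: List[Scorer],
    uni_features: Sequence[float],
    bi_features: Sequence[float],
    weight: float = 0.5,       # hypothetical mixing weight for the uni-modal scores
    threshold: float = 0.5,    # hypothetical cut-off for assigning a label
) -> List[str]:
    """Fuse uni-modal and bi-modal ensemble scores, then keep confident labels."""
    uni = uni_scorer(uni_features)
    bi = ensemble_scores(bi_members, bi_features)
    labels = sorted(set(uni) | set(bi))
    fused = {
        label: weight * uni.get(label, 0.0) + (1.0 - weight) * bi.get(label, 0.0)
        for label in labels
    }
    # Multi-label output: every label whose fused score clears the threshold.
    return [label for label in labels if fused[label] >= threshold]


# Toy stand-ins for trained classifiers; real ones would use the features.
def toy_uni(features: Sequence[float]) -> Scores:
    return {"beach": 0.75, "sunset": 0.25}


def toy_bi_a(features: Sequence[float]) -> Scores:
    return {"beach": 0.5, "sunset": 0.75}


def toy_bi_b(features: Sequence[float]) -> Scores:
    return {"beach": 1.0, "sunset": 0.75}


if __name__ == "__main__":
    print(hybrid_annotate(toy_uni, [toy_bi_a, toy_bi_b], [0.1, 0.2], [0.1, 0.2, 0.3]))
```

Because each label is thresholded independently, an image can receive zero, one, or several labels, which is what distinguishes the multi-label setting from ordinary multi-class classification.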

Author information

Authors and Affiliations

State Key Lab of Intelligent Technology and Systems, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, P.R. China
Wei Li & Maosong Sun
Fachbereich Informatik, Universität Hamburg, Hamburg, 22527, Germany
Christopher Habel

Editor information

Horace H.-S. Ip, Oscar C. Au, Howard Leung, Ming-Ting Sun, Wei-Ying Ma, Shi-Min Hu

Cite this paper

Li, W., Sun, M., Habel, C. (2007). Multi-modal Multi-label Semantic Indexing of Images Based on Hybrid Ensemble Learning. In: Ip, H.HS., Au, O.C., Leung, H., Sun, MT., Ma, WY., Hu, SM. (eds) Advances in Multimedia Information Processing – PCM 2007. PCM 2007. Lecture Notes in Computer Science, vol 4810. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77255-2_90

DOI: https://doi.org/10.1007/978-3-540-77255-2_90
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77254-5
Online ISBN: 978-3-540-77255-2
eBook Packages: Computer Science, Computer Science (R0)
