Article

Multi-level annotation of natural scenes using dominant image components and semantic concepts

Authors:

Hangzai Luo

MULTIMEDIA '04: Proceedings of the 12th annual ACM international conference on Multimedia

Pages 540 - 547

https://doi.org/10.1145/1027527.1027660

Published: 10 October 2004

Abstract

Automatic image annotation is a promising way to enable semantic image retrieval via keywords. In this paper, we propose a multi-level approach to annotating the semantics of natural scenes by using both the dominant image components (salient objects) and the relevant semantic concepts. To achieve automatic image annotation at the content level, we use salient objects as the dominant image components for image content representation and feature extraction. To support automatic image annotation at the concept level, we develop a novel image classification technique that maps images to the most relevant semantic image concepts. Support Vector Machine (SVM) classifiers are used to learn the detection functions for the pre-defined salient objects, and finite mixture models are used for semantic concept interpretation and modeling. An adaptive EM algorithm is proposed to determine the optimal model structure and the model parameters simultaneously. We also demonstrate that our algorithms are effective at enabling multi-level annotation of natural scenes in a large-scale image dataset.
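The abstract does not spell out how the adaptive EM algorithm works, so the following is only a rough illustration of the underlying idea of choosing a mixture model's structure (number of components) together with its parameters. It is not the paper's method: it substitutes plain EM on a 1-D Gaussian mixture, repeated for several candidate component counts, with the Bayesian Information Criterion (BIC) used to pick among them. All function names (`fit_mixture`, `select_structure`, and so on), the synthetic data, and the quantile-based initialization are assumptions made for this sketch.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_density(x, weights, means, variances):
    """Density of the whole mixture at point x."""
    return sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))


def fit_mixture(xs, k, iters=100, var_floor=1e-3):
    """Plain EM for a k-component 1-D Gaussian mixture (illustrative stand-in)."""
    n = len(xs)
    ordered = sorted(xs)
    # Deterministic initialization: means at evenly spaced quantiles of the data.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(xs) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in xs) / n, var_floor)
    variances = [grand_var] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances from responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(spread, var_floor)  # floor avoids degenerate spikes
    loglik = sum(math.log(mixture_density(x, weights, means, variances) or 1e-300) for x in xs)
    return weights, means, variances, loglik


def bic(loglik, k, n):
    """BIC for a 1-D mixture: k means + k variances + (k - 1) free weights."""
    return -2.0 * loglik + (3 * k - 1) * math.log(n)


def select_structure(xs, max_k=4):
    """Fit mixtures with 1..max_k components and return the count with lowest BIC."""
    scores = {}
    for k in range(1, max_k + 1):
        loglik = fit_mixture(xs, k)[3]
        scores[k] = bic(loglik, k, len(xs))
    return min(scores, key=scores.get), scores


# Synthetic 1-D "features" drawn from two well-separated clusters (assumed data).
rng = random.Random(42)
xs = [rng.gauss(0.0, 1.0) for _ in range(200)] + [rng.gauss(10.0, 1.0) for _ in range(200)]
best_k, scores = select_structure(xs)
print("selected number of components:", best_k)
```

Refitting from scratch for every candidate count is the simplest way to couple structure and parameter estimation; an adaptive scheme such as the one the abstract mentions would presumably adjust the structure during a single EM run instead.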


Index Terms

Multi-level annotation of natural scenes using dominant image components and semantic concepts
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object recognition
2. Information systems
  1. Information systems applications
    1. Multimedia information systems
      1. Multimedia databases

Published In

MULTIMEDIA '04: Proceedings of the 12th annual ACM international conference on Multimedia

October 2004

1028 pages

ISBN: 1581138938

DOI: 10.1145/1027527

General Chairs: Henning Schulzrinne (Columbia University), Nevenka Dimitrova (Philips Research)

Program Chairs: Angela Sasse (UCL), Sue Moon (KAIST), Rainer Lienhart (U Augsburg)

Copyright © 2004 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

MM04: 2004 12th Annual ACM International Conference on Multimedia

October 10 - 16, 2004

New York, NY, USA

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Article Metrics

Total Citations: 87

Total Downloads: 1,122

Downloads (Last 12 months): 4

Downloads (Last 6 weeks): 3

Reflects downloads up to 24 Jan 2025

Cited By

Tsai C, Hu Y, Lin W, Wang M (2017). Early versus Late Dimensionality Reduction of Bag-of-Words Feature Representation for Image Classification. Proceedings of the 4th International Conference on Bioinformatics Research and Applications, pp. 42-45. DOI: 10.1145/3175587.3175598. Online publication date: 8-Dec-2017
https://dl.acm.org/doi/10.1145/3175587.3175598
Chang J, Juang H, Chen Y, Chang C (2017). Safe binary particle swarm algorithm for an enhanced unsupervised label refinement in automatic face annotation. Multimedia Tools and Applications 76:18, pp. 18339-18359. DOI: 10.1007/s11042-016-4058-y. Online publication date: 1-Sep-2017
https://dl.acm.org/doi/10.1007/s11042-016-4058-y
Lei C, Liu D, Li W (2016). Social Diffusion Analysis With Common-Interest Model for Image Annotation. IEEE Transactions on Multimedia 18:4, pp. 687-701. DOI: 10.1109/TMM.2015.2477277. Online publication date: Apr-2016
https://doi.org/10.1109/TMM.2015.2477277
(2016). Statistical modeling for automatic image indexing and retrieval. Neurocomputing 207:C, pp. 105-119. DOI: 10.1016/j.neucom.2016.04.033. Online publication date: 26-Sep-2016
https://dl.acm.org/doi/10.1016/j.neucom.2016.04.033
Guo L (2015). Manifold Kernel Metric Learning for Larger-Scale Image Annotation. IEICE Transactions on Information and Systems E98.D:7, pp. 1396-1400. DOI: 10.1587/transinf.2014EDL8216. Online publication date: 2015
https://doi.org/10.1587/transinf.2014EDL8216
Manh N, Tuan N, Sang D, Binh H, Thuy N (2015). Uniform Detection in Social Image Streams. 2015 Seventh International Conference on Knowledge and Systems Engineering (KSE), pp. 180-185. DOI: 10.1109/KSE.2015.63. Online publication date: Oct-2015
https://doi.org/10.1109/KSE.2015.63
Wang D, Hoi S, He Y, Zhu J (2014). Mining Weakly Labeled Web Facial Images for Search-Based Face Annotation. IEEE Transactions on Knowledge and Data Engineering 26:1, pp. 166-179. DOI: 10.1109/TKDE.2012.240. Online publication date: 1-Jan-2014
https://dl.acm.org/doi/10.1109/TKDE.2012.240
Vo P, Sahbi H (2014). Modeling label dependencies in kernel learning for image annotation. 2014 IEEE International Conference on Image Processing (ICIP), pp. 5886-5890. DOI: 10.1109/ICIP.2014.7026189. Online publication date: Oct-2014
https://doi.org/10.1109/ICIP.2014.7026189
Wang D, Hoi S, Wu P, Zhu J, He Y, Miao C, Jones G, Sheridan P, Kelly D, de Rijke M, Sakai T (2013). Learning to name faces. Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval, pp. 443-452. DOI: 10.1145/2484028.2484040. Online publication date: 28-Jul-2013
https://dl.acm.org/doi/10.1145/2484028.2484040
Yan L, Xingbo Z (2013). Image Semantic Information Mining Algorithm by Non-negative Matrix Factorization. Proceedings of the 2013 Fourth International Conference on Intelligent Systems Design and Engineering Applications, pp. 345-348. DOI: 10.1109/ISDEA.2013.482. Online publication date: 6-Nov-2013
https://dl.acm.org/doi/10.1109/ISDEA.2013.482
