DOI: 10.1145/1008992.1009055
Article

Automatic image annotation by using concept-sensitive salient objects for image content representation

Published: 25 July 2004

Abstract

Multi-level annotation of images is a promising way to enable more effective semantic image retrieval by using keywords at different semantic levels. In this paper, we propose a multi-level approach to annotating the semantics of natural scenes by using both the dominant image components and the relevant semantic concepts. In contrast to the well-known image-based and region-based approaches, we use salient objects as the dominant image components to achieve automatic image annotation at the content level. By using the salient objects for image content representation, we develop a novel image classification technique to achieve automatic image annotation at the concept level. To detect the salient objects automatically, a set of detection functions is learned from labeled image regions by using Support Vector Machine (SVM) classifiers with an automatic scheme for searching for the optimal model parameters. To generate the semantic concepts, finite mixture models are used to approximate the class distributions of the relevant salient objects. An adaptive EM algorithm is proposed to determine the optimal model structure and model parameters simultaneously. We also demonstrate that our algorithms are effective in enabling multi-level annotation of natural scenes in a large-scale dataset.
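
The idea of choosing a mixture model's structure (number of components) together with its parameters can be illustrated with a small sketch. This page does not describe the paper's adaptive EM algorithm, so the code below is not that method: it assumes one-dimensional features, runs plain EM for each candidate component count, and uses the Bayesian Information Criterion (BIC) as a stand-in selection rule. The function names `fit_gmm_1d` and `select_components`, the quantile initialisation, and the variance floor are all hypothetical choices made for illustration.

```python
import math


def fit_gmm_1d(xs, k, iters=200, tol=1e-8):
    """Fit a k-component 1-D Gaussian mixture with plain EM.

    Returns (weights, means, variances, log_likelihood).
    """
    n = len(xs)
    ordered = sorted(xs)
    # Assumed initialisation: spread the means over evenly spaced quantiles.
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    centre = sum(xs) / n
    total_var = sum((x - centre) ** 2 for x in xs) / n
    variances = [total_var] * k
    weights = [1.0 / k] * k
    floor = 1e-6 * total_var + 1e-12  # variance floor guards against collapse

    def log_joint(x, j):
        # log(weight_j * Normal(x | mean_j, variance_j))
        v = variances[j]
        return (math.log(weights[j])
                - 0.5 * (math.log(2.0 * math.pi * v) + (x - means[j]) ** 2 / v))

    def e_step():
        resp, loglik = [], 0.0
        for x in xs:
            logs = [log_joint(x, j) for j in range(k)]
            top = max(logs)  # log-sum-exp for numerical stability
            norm = top + math.log(sum(math.exp(v - top) for v in logs))
            loglik += norm
            resp.append([math.exp(v - norm) for v in logs])
        return resp, loglik

    previous = float("-inf")
    for _ in range(iters):
        resp, loglik = e_step()
        if loglik - previous < tol:  # stop once the likelihood stalls
            break
        previous = loglik
        for j in range(k):  # M-step: re-estimate weight, mean, variance
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2
                         for r, x in zip(resp, xs)) / nj
            variances[j] = max(spread, floor)
    _, loglik = e_step()  # log-likelihood of the final parameters
    return weights, means, variances, loglik


def select_components(xs, max_k=4):
    """Fit k = 1..max_k and score each fit with BIC (lower is better)."""
    n = len(xs)
    fits = {}
    for k in range(1, max_k + 1):
        weights, means, variances, loglik = fit_gmm_1d(xs, k)
        n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
        bic = -2.0 * loglik + n_params * math.log(n)
        fits[k] = {"bic": bic, "weights": weights, "means": means,
                   "variances": variances}
    best_k = min(fits, key=lambda k: fits[k]["bic"])
    return best_k, fits
```

Calling `select_components(values, max_k=3)` on a list of floats returns the preferred component count and the per-count fits. On clearly bimodal data the criterion should prefer more than one component, since a single Gaussian has to stretch its variance across both modes. Real image features would be multi-dimensional, which this sketch does not cover.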


Published In

SIGIR '04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
July 2004
624 pages
ISBN: 1581138814
DOI: 10.1145/1008992

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. adaptive EM algorithm
  2. multi-level image annotation
  3. salient objects
  4. semantic image classification

Qualifiers

  • Article

Conference

SIGIR04

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Cited By

  • (2024) Image CAPTCHAs: When Deep Learning Breaks the Mold. IEEE Access 12, 112211-112231. DOI: 10.1109/ACCESS.2024.3442976. Online publication date: 2024
  • (2022) The Image Annotation Refinement in Embedding Feature Space based on Mutual Information. International Journal of Circuits, Systems and Signal Processing 16, 191-201. DOI: 10.46300/9106.2022.16.23. Online publication date: 10-Jan-2022
  • (2020) A review on visual content-based and users’ tags-based image annotation: methods and techniques. Multimedia Tools and Applications. DOI: 10.1007/s11042-020-08862-1. Online publication date: 9-May-2020
  • (2018) Emotional Attention: A Study of Image Sentiment and Visual Attention. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 7521-7531. DOI: 10.1109/CVPR.2018.00785. Online publication date: Jun-2018
  • (2017) Automatic Image Annotation Based on Particle Swarm Optimization and Support Vector Clustering. Mathematical Problems in Engineering 2017:1. DOI: 10.1155/2017/8493267. Online publication date: 18-May-2017
  • (2017) Cuboid Segmentation for Effective Image Retrieval. 2017 International Conference on Digital Image Computing: Techniques and Applications (DICTA), 1-8. DOI: 10.1109/DICTA.2017.8227422. Online publication date: Nov-2017
  • (2016) A Resource Aware MapReduce Based Parallel SVM for Large Scale Image Classifications. Neural Processing Letters 44:1, 161-184. DOI: 10.1007/s11063-015-9472-z. Online publication date: 1-Aug-2016
  • (2016) Weighted subspace modeling for semantic concept retrieval using gaussian mixture models. Information Systems Frontiers 18:5, 877-889. DOI: 10.1007/s10796-016-9660-z. Online publication date: 1-Oct-2016
  • (2015) Gaussian Mixture Model-Based Subspace Modeling for Semantic Concept Retrieval. Proceedings of the 2015 IEEE International Conference on Information Reuse and Integration, 258-265. DOI: 10.1109/IRI.2015.50. Online publication date: 13-Aug-2015
  • (2015) Interactive tool to improve the automatic image annotation using MPEG-7 and multi-class SVM. 2015 7th Conference on Information and Knowledge Technology (IKT), 1-7. DOI: 10.1109/IKT.2015.7288777. Online publication date: May-2015
