Article

Automatic image annotation by using concept-sensitive salient objects for image content representation

Authors:

Guangyou Xu

SIGIR '04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval

Pages 361 - 368

https://doi.org/10.1145/1008992.1009055

Published: 25 July 2004

Abstract

Multi-level annotation of images is a promising way to make semantic image retrieval more effective, because it supplies keywords at different semantic levels. In this paper, we propose a multi-level approach that annotates the semantics of natural scenes using both the dominant image components and the relevant semantic concepts. In contrast to the well-known image-based and region-based approaches, we use salient objects as the dominant image components to achieve automatic image annotation at the content level. With salient objects as the image content representation, we develop a novel image classification technique to achieve automatic image annotation at the concept level. To detect the salient objects automatically, a set of detection functions is learned from labeled image regions using Support Vector Machine (SVM) classifiers, with an automatic scheme for searching for the optimal model parameters. To generate the semantic concepts, finite mixture models are used to approximate the class distributions of the relevant salient objects, and an adaptive EM algorithm is proposed to determine the optimal model structure and model parameters simultaneously. We also demonstrate that our algorithms are effective at multi-level annotation of natural scenes in a large-scale dataset.
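The abstract names finite mixture models and an adaptive EM algorithm that picks the model structure and parameters together, but it does not spell out either. As a rough illustration of the general idea only, the sketch below substitutes a standard stand-in: plain EM for one-dimensional Gaussian mixtures, with the number of components chosen by the Bayesian Information Criterion (BIC). This is not the paper's adaptive EM. The function names (`fit_gmm_1d`, `select_num_components`), the 1-D feature, the quantile initialisation, and the variance floor are all assumptions made for the example.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def mixture_log_likelihood(xs, weights, means, variances):
    """Log-likelihood of the data under a 1-D Gaussian mixture."""
    total = 0.0
    for x in xs:
        p = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        total += math.log(max(p, 1e-300))  # guard against log(0)
    return total


def fit_gmm_1d(xs, k, iters=100):
    """Fit a k-component 1-D Gaussian mixture with plain EM (hypothetical helper)."""
    n = len(xs)
    grand_mean = sum(xs) / n
    grand_var = sum((x - grand_mean) ** 2 for x in xs) / n
    floor = 1e-3 * grand_var + 1e-12  # assumed variance floor to avoid collapse
    ordered = sorted(xs)
    # Deterministic initialisation: means at evenly spaced quantiles.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [max(grand_var, floor)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            s = sum(p)
            resp.append([pj / s for pj in p] if s > 0 else [1.0 / k] * k)
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-9:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, floor)
    ll = mixture_log_likelihood(xs, weights, means, variances)
    return weights, means, variances, ll


def select_num_components(xs, max_k=4):
    """Return (best_k, {k: BIC}) where best_k minimises BIC over 1..max_k."""
    n = len(xs)
    scores = {}
    for k in range(1, max_k + 1):
        _, _, _, ll = fit_gmm_1d(xs, k)
        num_params = 3 * k - 1  # k means, k variances, k - 1 free weights
        scores[k] = -2.0 * ll + num_params * math.log(n)
    return min(scores, key=scores.get), scores
```

Calling `select_num_components(values, max_k=4)` on a list of scalar feature values returns the component count with the lowest BIC along with the score for each candidate. Fitting each candidate separately and comparing scores is simpler than adjusting the structure inside a single EM run, at the cost of refitting the model once per candidate.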

Index Terms

Automatic image annotation by using concept-sensitive salient objects for image content representation
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object recognition
2. Information systems
  1. Information systems applications
    1. Multimedia information systems
      1. Multimedia databases

Recommendations

Multi-level annotation of natural scenes using dominant image components and semantic concepts
MULTIMEDIA '04: Proceedings of the 12th annual ACM international conference on Multimedia

Automatic image annotation is a promising solution to enable semantic image retrieval via keywords. In this paper, we propose a multi-level approach to annotate the semantics of natural scenes by using both the dominant image components (...

Incorporating concept ontology to enable probabilistic concept reasoning for multi-level image annotation
MIR '06: Proceedings of the 8th ACM international workshop on Multimedia information retrieval

To enable automatic multi-level image annotation, we have addressed two inter-related important issues: (1) more effective framework for image content representation and feature extraction to characterize the middle-level semantics of image contents; (2)...

Statistical modeling and conceptualization of natural images

Multi-level annotation of images is a promising solution to enable semantic image retrieval by using various keywords at different semantic levels. In this paper, we propose a multi-level approach to interpret and annotate the semantics of natural ...

Information

Published In

SIGIR '04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval

July 2004

624 pages

ISBN:1581138814

DOI:10.1145/1008992

General Chair: Mark Sanderson, University of Sheffield (UK)
Program Chairs: Kalervo Järvelin, University of Tampere (Finland); James Allan, University of Massachusetts (USA); Peter Bruza, Distributed Systems Technology Centre (Australia)

Copyright © 2004 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

SIGIR04

SIGIR04: The 27th ACM/SIGIR International Symposium on Information Retrieval 2004

July 25 - 29, 2004

Sheffield, United Kingdom

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Bibliometrics & Citations

Article Metrics

Total Citations: 45
Total Downloads: 1,752
Downloads (Last 12 months): 3
Downloads (Last 6 weeks): 1

Reflects downloads up to 24 Jan 2025

Cited By

Moradi M, Moradi M, Palazzo S, Rundo F, Spampinato C (2024). Image CAPTCHAs: When Deep Learning Breaks the Mold. IEEE Access 12, pp. 112211-112231. Online publication date: 2024.
https://doi.org/10.1109/ACCESS.2024.3442976
Li W, Song H, Zhang H, Li H, Wang P (2022). The Image Annotation Refinement in Embedding Feature Space based on Mutual Information. International Journal of Circuits, Systems and Signal Processing 16, pp. 191-201. Online publication date: 10-Jan-2022.
https://doi.org/10.46300/9106.2022.16.23
Bouchakwa M, Ayadi Y, Amous I (2020). A review on visual content-based and users’ tags-based image annotation: methods and techniques. Multimedia Tools and Applications. Online publication date: 9-May-2020.
https://doi.org/10.1007/s11042-020-08862-1
Fan S, Shen Z, Jiang M, Koenig B, Xu J, Kankanhalli M, Zhao Q (2018). Emotional Attention: A Study of Image Sentiment and Visual Attention. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7521-7531. Online publication date: Jun-2018.
https://doi.org/10.1109/CVPR.2018.00785
Hao Z, Ge H, Gu T (2017). Automatic Image Annotation Based on Particle Swarm Optimization and Support Vector Clustering. Mathematical Problems in Engineering 2017:1. Online publication date: 18-May-2017.
https://doi.org/10.1155/2017/8493267
Murshed M, Teng S, Lu G (2017). Cuboid Segmentation for Effective Image Retrieval. 2017 International Conference on Digital Image Computing: Techniques and Applications (DICTA), pp. 1-8. Online publication date: Nov-2017.
https://doi.org/10.1109/DICTA.2017.8227422
Guo W, Alham N, Liu Y, Li M, Qi M (2016). A Resource Aware MapReduce Based Parallel SVM for Large Scale Image Classifications. Neural Processing Letters 44:1, pp. 161-184. Online publication date: 1-Aug-2016.
https://dl.acm.org/doi/10.1007/s11063-015-9472-z
Chen C, Shyu M, Chen S (2016). Weighted subspace modeling for semantic concept retrieval using gaussian mixture models. Information Systems Frontiers 18:5, pp. 877-889. Online publication date: 1-Oct-2016.
https://dl.acm.org/doi/10.1007/s10796-016-9660-z
Chen C, Shyu M, Chen S (2015). Gaussian Mixture Model-Based Subspace Modeling for Semantic Concept Retrieval. Proceedings of the 2015 IEEE International Conference on Information Reuse and Integration, pp. 258-265. Online publication date: 13-Aug-2015.
https://dl.acm.org/doi/10.1109/IRI.2015.50
Majidpour J, Khezri E, Hassanzade H, Mohammed K (2015). Interactive tool to improve the automatic image annotation using MPEG-7 and multi-class SVM. 2015 7th Conference on Information and Knowledge Technology (IKT), pp. 1-7. Online publication date: May-2015.
https://doi.org/10.1109/IKT.2015.7288777
