Hidden Markov models for automatic annotation and content-based retrieval of images and video

Authors:

SIGIR '05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval

Pages 544 - 551

https://doi.org/10.1145/1076034.1076127

Published: 15 August 2005

Abstract

This paper introduces a novel method for automatically annotating images with keywords from a generic vocabulary of concepts or objects, for the purpose of content-based image retrieval. An image, represented as a sequence of feature vectors characterizing low-level visual features such as color, texture or oriented edges, is modeled as having been stochastically generated by a hidden Markov model whose states represent concepts. The parameters of the model are estimated from a set of manually annotated (training) images. Each image in a large test collection is then automatically annotated with the a posteriori probability of the concepts present in it. This annotation supports content-based search of the image collection via keywords. Various aspects of model parameterization, parameter estimation, and image annotation are discussed. Empirical retrieval results are presented on two image collections: COREL and key-frames from TRECVID. Comparisons are made with two other recently developed techniques on the same datasets.
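As a rough illustration of the annotation step described in the abstract, the sketch below runs the standard scaled forward-backward recursions on a toy HMM whose states stand for concepts, then averages the per-block state posteriors into one score per concept. The diagonal-Gaussian emissions, the averaging rule, the function names, and all numbers are assumptions made for illustration only; the abstract does not specify the paper's actual parameterization or estimation procedure.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a diagonal-covariance Gaussian at feature vector x (assumed emission model)."""
    p = 1.0
    for xi, mi, vi in zip(x, mean, var):
        p *= math.exp(-0.5 * (xi - mi) ** 2 / vi) / math.sqrt(2.0 * math.pi * vi)
    return p


def concept_posteriors(features, init, trans, means, variances):
    """Scaled forward-backward; returns gamma[t][s] = P(state s at block t | all features)."""
    T, S = len(features), len(init)
    # Emission likelihood of every block under every concept state.
    b = [[gaussian_pdf(features[t], means[s], variances[s]) for s in range(S)]
         for t in range(T)]

    # Forward pass, normalizing each step to avoid numerical underflow.
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            row = [init[s] * b[0][s] for s in range(S)]
        else:
            row = [b[t][s] * sum(alpha[t - 1][r] * trans[r][s] for r in range(S))
                   for s in range(S)]
        c = sum(row)
        alpha.append([v / c for v in row])
        scale.append(c)

    # Backward pass, reusing the forward scale factors.
    beta = [[1.0] * S for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for r in range(S):
            beta[t][r] = sum(trans[r][s] * b[t + 1][s] * beta[t + 1][s]
                             for s in range(S)) / scale[t + 1]

    # Per-block posterior over concept states.
    gamma = []
    for t in range(T):
        row = [alpha[t][s] * beta[t][s] for s in range(S)]
        z = sum(row)
        gamma.append([v / z for v in row])
    return gamma


def image_concept_scores(gamma, concepts):
    """Average the per-block posteriors into one score per concept (an assumed pooling rule)."""
    T = len(gamma)
    return {c: sum(gamma[t][s] for t in range(T)) / T for s, c in enumerate(concepts)}


# Toy example: two hypothetical concepts and a 1-D "brightness" feature per image block.
concepts = ["sky", "grass"]
init = [0.5, 0.5]
trans = [[0.8, 0.2], [0.2, 0.8]]
means = [[0.9], [0.2]]
variances = [[0.05], [0.05]]
features = [[0.95], [0.85], [0.25], [0.15]]

gamma = concept_posteriors(features, init, trans, means, variances)
scores = image_concept_scores(gamma, concepts)
print(scores)
```

Ranking the images of a collection by such a per-concept score is one simple way to support keyword search; other pooling rules over the block posteriors would serve equally well in a sketch like this.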

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object recognition
2. Information systems
  1. Information retrieval
    1. Retrieval models and ranking

Published In

SIGIR '05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval

August 2005

708 pages

ISBN:1595930345

DOI:10.1145/1076034

General Chairs: Ricardo Baeza-Yates (University of Chile, Chile), Nivio Ziviani (Federal University of Minas Gerais, Brazil)

Program Chairs: Gary Marchionini (University of North Carolina, USA), Alistair Moffat (University of Melbourne, Australia), John Tait (University of Sunderland, UK)

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

SIGIR05: The 28th ACM/SIGIR International Symposium on Information Retrieval 2005

August 15 - 19, 2005

Salvador, Brazil

Sponsor: SIGIR

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Article Metrics

Total Citations: 59

Total Downloads: 1,333

Downloads (Last 12 months): 10

Downloads (Last 6 weeks): 0

Reflects downloads up to 12 Feb 2025