research-article

A probabilistic topic-connection model for automatic image annotation

Authors:

E. K. Park

CIKM '10: Proceedings of the 19th ACM international conference on Information and knowledge management

Pages 899 - 908

https://doi.org/10.1145/1871437.1871552

Published: 26 October 2010

Abstract

The explosive growth of image data on the Internet has made indexing and automatically annotating images an important yet very challenging task. To that end, sophisticated algorithms and models have been proposed to study the correlation between image content and the corresponding text descriptions. Despite the success of previous work, researchers still face two major difficulties that may undermine their efforts to provide reliable and accurate annotations for images. The first is the lack of a comprehensive benchmark image dataset with high-quality text descriptions. The second is the lack of an effective way to represent image content and associate it with the text descriptions. In this paper, we aim to deal with both problems. To deal with the first, we use Wikipedia as an external knowledge source and enrich the ontology structure of the ImageNet database with comprehensive and highly reliable text descriptions from Wikipedia articles. To address the second, we develop a Probabilistic Topic-Connection (PTC) model to represent the connection between latent semantic topics in the text descriptions and latent patterns in the image feature space. We compare the performance of our model with the currently popular Correspondence LDA (Corr-LDA) model under the same automatic image annotation scenario using cross-validation. Experimental results demonstrate that our model represents the connection between latent semantic topics and latent patterns in the image feature space well, and thus facilitates knowledge organization and understanding of both images and text descriptions.

Index Terms

A probabilistic topic-connection model for automatic image annotation
1. Computing methodologies
  1. Machine learning
2. Information systems
  1. Information systems applications
    1. Data mining
    2. Multimedia information systems
      1. Multimedia databases

Published In

CIKM '10: Proceedings of the 19th ACM international conference on Information and knowledge management

October 2010

2036 pages

ISBN:9781450300995

DOI:10.1145/1871437

General Chair: Jimmy Huang (York University, Canada)

Program Chairs: Nick Koudas (University of Toronto, Canada), Gareth Jones (Dublin City University, Ireland), Xindong Wu (University of Vermont, USA), Kevyn Collins-Thompson (Microsoft Research, USA), Aijun An (York University, Canada)

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

CIKM '10: International Conference on Information and Knowledge Management

October 26 - 30, 2010

Toronto, ON, Canada

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Cited By

Kawai M, Sato H, Shiohama T (2022). Topic model-based recommender systems and their applications to cold-start problems. Expert Systems with Applications, 202 (117129). Online publication date: Sep-2022.
https://doi.org/10.1016/j.eswa.2022.117129
Sun G, Wu X, Peng Q (2016). Part-based clothing image annotation by visual neighbor retrieval. Neurocomputing, 213:C (115-124). Online publication date: 12-Nov-2016.
https://dl.acm.org/doi/10.1016/j.neucom.2015.12.141
Luo C, He T, Zhang X, Zhou Z (2015). Learning Forum Posts Topic Discovery and Its Application in Recommendation System. Journal of Software, 10:4 (392-402). Online publication date: Apr-2015.
https://doi.org/10.17706/jsw.10.4.392-402
Lin Z, Ding G, Hu M (2015). Image auto-annotation via tag-dependent random search over range-constrained visual neighbours. Multimedia Tools and Applications, 74:11 (4091-4116). Online publication date: 1-Jun-2015.
https://dl.acm.org/doi/10.1007/s11042-013-1811-3
Wang Z, Cui P, Xie L, Zhu W, Rui Y, Yang S (2014). Bilateral Correspondence Model for Words-and-Pictures Association in Multimedia-Rich Microblogs. ACM Transactions on Multimedia Computing, Communications, and Applications, 10:4 (1-21). Online publication date: 4-Jul-2014.
https://dl.acm.org/doi/10.1145/2611388
Wang J, Hu X, Tu X, He T, Chen X, Lebanon G, Wang H, Zaki M (2012). Author-conference topic-connection model for academic network search. Proceedings of the 21st ACM international conference on Information and knowledge management (2179-2183). Online publication date: 29-Oct-2012.
https://dl.acm.org/doi/10.1145/2396761.2398597
Chen X, Hu X, Zhou Z, An Y, He T, Park E, Chen X, Lebanon G, Wang H, Zaki M (2012). Modeling semantic relations between visual attributes and object categories via dirichlet forest prior. Proceedings of the 21st ACM international conference on Information and knowledge management (1263-1272). Online publication date: 29-Oct-2012.
https://dl.acm.org/doi/10.1145/2396761.2398428
