research-article

Towards indexing representative images on the web

Authors:
Xin-Jing Wang, Microsoft Research Asia, Beijing, China
Zheng Xu, University of Science and Technology of China, Hefei, China
Lei Zhang, Microsoft Research Asia, Beijing, China
Ce Liu, Microsoft Research New England, Boston, MA, USA
Yong Rui, Microsoft Research Asia, Beijing, China

MM '12: Proceedings of the 20th ACM international conference on Multimedia, October 2012, Pages 1229–1238
https://doi.org/10.1145/2393347.2396423

Published: 29 October 2012

ABSTRACT

Even after 20 years of research on real-world image retrieval, there is still a big gap between what search engines can provide and what users expect to see. To bridge this gap, we present an image knowledge base, ImageKB, a graph representation of structured entities, categories, and representative images, as a new basis for practical image indexing and search. ImageKB is constructed automatically by a scalable approach that is both bottom-up and top-down and that efficiently matches 2 billion web images onto an ontology with millions of nodes. Our approach consists of identifying duplicate image clusters from billions of images, obtaining a candidate set of entities and their images, discovering definitive texts to represent an image, and identifying representative images for an entity. To date, ImageKB contains 235.3M representative images corresponding to 0.52M entities, much larger than the state-of-the-art alternative, ImageNet, which contains 14.2M images for 0.02M synsets. Compared to existing image databases, ImageKB reflects the distributions of both images on the web and users' interests, contains rich semantic descriptions for images and entities, and can be widely used for both text-to-image search and image-to-text understanding.
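
The abstract describes ImageKB as a graph of entities, categories, and representative images that can serve both text-to-image search and image-to-text understanding. As a minimal sketch of that idea only, the Python below builds a toy in-memory graph with lookups in both directions. The class, method, and field names, as well as the sample entity and image IDs, are assumptions made for illustration; they do not come from the paper, which gives no implementation details on this page.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Entity:
    """One entity node; every field name here is hypothetical."""
    name: str
    categories: set[str] = field(default_factory=set)
    image_ids: list[str] = field(default_factory=list)  # representative images


class ImageGraph:
    """Toy graph linking categories, entities, and representative images."""

    def __init__(self) -> None:
        self.entities: dict[str, Entity] = {}
        self.image_to_entity: dict[str, str] = {}

    def add_entity(self, name: str, categories: set[str]) -> None:
        self.entities[name] = Entity(name, set(categories))

    def add_representative_image(self, entity: str, image_id: str) -> None:
        # Store the edge in both directions so either lookup is a dict access.
        self.entities[entity].image_ids.append(image_id)
        self.image_to_entity[image_id] = entity

    def text_to_images(self, entity: str) -> list[str]:
        """Text-to-image direction: entity name -> representative image IDs."""
        found = self.entities.get(entity)
        return list(found.image_ids) if found else []

    def image_to_text(self, image_id: str) -> tuple[str, set[str]] | None:
        """Image-to-text direction: image ID -> (entity name, categories)."""
        name = self.image_to_entity.get(image_id)
        if name is None:
            return None
        return name, set(self.entities[name].categories)


# Hypothetical sample data, used only to exercise the two lookups.
kb = ImageGraph()
kb.add_entity("golden retriever", {"dog", "animal"})
kb.add_representative_image("golden retriever", "img-001")
kb.add_representative_image("golden retriever", "img-002")

print(kb.text_to_images("golden retriever"))
print(kb.image_to_text("img-002"))
```

Keeping a reverse map from image to entity is one simple way to support both query directions named in the abstract; a system at the scale the abstract reports would need a very different storage layer, which this sketch does not attempt to model.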

Index Terms

Towards indexing representative images on the web
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
2. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
    2. Retrieval tasks and goals
  2. Information systems applications
    1. Multimedia information systems
      1. Multimedia databases

Published in
MM '12: Proceedings of the 20th ACM international conference on Multimedia
October 2012
1584 pages
ISBN: 9781450310895
DOI: 10.1145/2393347
General Chairs:
Noboru Babaguchi, Osaka University, Japan
Kiyoharu Aizawa, The University of Tokyo, Japan
John Smith, IBM, USA
Program Chairs:
Shin'ichi Satoh, National Institute of Informatics, Japan
Thomas Plagemann, University of Oslo, Norway
Xian-Sheng Hua, Microsoft, USA
Rong Yan, Facebook, USA
Copyright © 2012 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
image knowledge base
image understanding
large-scale text to image translation
Acceptance Rates
Overall Acceptance Rate: 995 of 4,171 submissions, 24%