ABSTRACT
Even after 20 years of research on real-world image retrieval, there is still a big gap between what search engines can provide and what users expect to see. To bridge this gap, we present an image knowledge base, ImageKB, a graph representation of structured entities, categories, and representative images, as a new basis for practical image indexing and search. ImageKB is automatically constructed via a both bottom-up and top-down, scalable approach that efficiently matches 2 billion web images onto an ontology with millions of nodes. Our approach consists of identifying duplicate image clusters from billions of images, obtaining a candidate set of entities and their images, discovering definitive texts to represent an image and identifying representative images for an entity. To date, ImageKB contains 235.3M representative images corresponding to 0.52M entities, much larger than the state-of-the-art alternative ImageNet that contains 14.2M images for 0.02M synsets. Compared to existing image databases, ImageKB reflects the distributions of both images on the web and users' interests, contains rich semantic descriptions for images and entities, and can be widely used for both text to image search and image to text understanding.
- Smith, J., Chang, S.F.: An image and video search engine for the world wide web (1996) In: SPIE.Google Scholar
- Datta, R., Joshi, D., Li, J., Wang, J.Z.: Image retrieval: Ideas, influences, and trends of the new age. ACM Computing Surveys 40 (2008) 1--60. Google ScholarDigital Library
- Garcia, S., Williams, H.E., Cannane, A.: Access-ordered indexes (2004) In: ACSC. Google ScholarDigital Library
- Zobel, J., Moffat, A.: Inverted files for text search engines. ACM Computing Surveys 38 (2006) Google ScholarDigital Library
- Torralba, A., Fergus, R., Freeman, W.T.: 80 million tiny images: a large dataset for non-parametric object and scene recognition. (In: IEEE T-PAMI). Google ScholarDigital Library
- Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: A large-scale hierarchical image database (2009) In: CVPR.Google Scholar
- Fellbaum, C.: Wordnet: An electronic lexical database (1998) Bradford Books.Google Scholar
- Shi, S., Zhang, H., Yuan, X., Wen, J.: Corpus-based semantic class mining: distributional vs. pattern-based approaches (2010) In: ICCL. Google ScholarDigital Library
- Fei-Fei, L., Fergus, R., Perona, P.: Learning generative visual models from few training examples: an incremental bayesian approach tested on 101 object categories (2004) In: CVPR Workshop on Generative-Model Based Vision. Google ScholarDigital Library
- Fei-Fei, L., Fergus, R., Perona, P.: Learning generative visual models from few training examples: an incremental bayesian approach tested on 101 object categories (2004) In: CVPR Workshop on Generative-Model Based Vision. Google ScholarDigital Library
- Fei-Fei, L., Fergus, R., Perona, P.: Learning generative visual models from few training examples: an incremental bayesian approach tested on 101 object categories (2004) In: CVPR Workshop on Generative-Model Based Vision. Google ScholarDigital Library
- Griffin, G., Holub, A., Perona, P.: Caltech-256 object category dataset. Technical Report 7694 (2007).Google Scholar
- Russell, B., Torralba, A., Murphy, K., Freeman, W.: Labelme: a database and web-based tool for image annotation. In: IJCV 77 (2008) 157--173. Google ScholarDigital Library
- Deselaers, T., Ferrari, V.: Visual and semantic similarity in imagenet (2011) In: CVPR. Google ScholarDigital Library
- Weston, J., Bengio, S., Usunier, N.: Large scale image annotation: Learning to rank with joint word-image embeddings (2010) In: ECCV.Google Scholar
- Weston, J., Bengio, S., Usunier, N.: Large scale image annotation: Learning to rank with joint word-image embeddings (2010) In: ECCV.Google Scholar
- Good, J.: How many photos have ever been taken? (2011) http://blog.1000memories.com/94-number-ofphotos-ever-taken-digital-and-analog-in-shoebox.Google Scholar
- Wang, X.J., Zhang, L., Jing, F., Ma, W.Y.: Lei zhang, feng jing, wei-ying ma, annosearch: Image auto-annotation by search (2006) In: CVPR. Google ScholarDigital Library
- Wang, X.J., Zhang, L., Liu, M., Li, Y., Ma, W.Y.: Arista - image search to annotation on billions of web photos (2010) In: CVPR.Google Scholar
- Wang, X.J., Zhang, L., Ma, W.Y.: Duplicate search-based image annotation using web-scale data. Proceedings of IEEE (2012)Google Scholar
- Sivic, J., Zisserman, A.: Video google: A text retrieval approach to object matching in videos (2003) In Proc. ICCV. Google ScholarDigital Library
- Chum, O., Philbin, J., Zisserman, A.: Near duplicate image detection: min-hash and tf-idf weighting (2008) In Proc. BMVC.Google Scholar
- Ke, Y., Sukthankar, R., Huston, L.: Efficient near-duplicate detection and sub-image retrieval (2004) In: ACM Multimedia. Google ScholarDigital Library
- Chum, O., Matas, J.: Large scale discovery of spatilly related images. IEEE T-PAMI (2010) Google ScholarDigital Library
- Lee, D., Ke, Q., Isard, M.: Partition min-hash for partial duplicate image discovery (2010) In: ECCV. Google ScholarDigital Library
- Pearson, K.: On lines and planes of closest fit to systems of points in space. Philosophical Magazine 2 (1901) 559--572.Google ScholarCross Ref
- Abdi, H., Williams, L.: Principal component analysis. Wiley Interdisciplinary Reviews: Computational Statistics 2 (2010) 433--459.Google ScholarDigital Library
- Yang, Y., Pedersen, J.O.: A comparative study on feature selection in text categorization (1997) In: ICML. Google ScholarDigital Library
- Chang, C., Lin, C.: Libsvm: A library for support vector machines (2012) http://www.csie.ntu.edu.tw/cjlin/libsvm. Google ScholarDigital Library
- Platt, J.: Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In: Advances in Large Margin Classifiers. MIT Press (1999).Google Scholar
Index Terms
- Towards indexing representative images on the web
Recommendations
Understanding web images by object relation network
WWW '12: Proceedings of the 21st international conference on World Wide WebThis paper presents an automatic method for understanding and interpreting the semantics of unannotated web images. We observe that the relations between objects in an image carry important semantics about the image. To capture and describe such ...
Joint statistical analysis of images and keywords with applications in semantic image enhancement
MM '12: Proceedings of the 20th ACM international conference on MultimediaWith the advent of social image-sharing communities, millions of images with associated semantic tags are now available online for free and allow us to exploit this abundant data in new ways. We present a fast non-parametric statistical framework ...
Image search—from thousands to billions in 20 years
Special Sections on the 20th Anniversary of ACM International Conference on Multimedia, Best Papers of ACM Multimedia 2012This article presents a comprehensive review and analysis on image search in the past 20 years, emphasizing the challenges and opportunities brought by the astonishing increase of dataset scales from thousands to billions in the same time period, which ...
Comments