ABSTRACT
Can we build a universal detector that recognizes unseen objects for which no training exemplars are available? Such a detector is highly desirable, as there are hundreds of thousands of object concepts in human vocabulary but few labeled image examples. In this study, we attempt to build such a universal detector to predict concepts in the absence of training data. First, by considering both semantic relatedness and visual variance, we mine a set of realistic small-semantic-gap (SSG) concepts from a large-scale image corpus. Detectors for these concepts deliver reasonably satisfactory recognition accuracy. Building on these distinctive visual models, we then leverage semantic ontology knowledge and concept co-occurrence statistics to extend visual recognition to unseen concepts. To the best of our knowledge, this work is the first to substantiate semantic gap measurement for a large number of concepts and to leverage visually learnable concepts to predict those with no training images available. Experiments on the NUS-WIDE dataset demonstrate that the selected concepts with small semantic gaps can be modeled well, and that the prediction of unseen concepts delivers promising results, with accuracy comparable to preliminary training-based methods.
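The idea of extending recognition from learnable concepts to an unseen one can be illustrated with a minimal sketch. The abstract does not state how detector outputs are combined, so the rule below (a similarity-weighted average over the most related known concepts) is an assumption made for illustration, not the authors' method. The function name `predict_unseen`, the `top_k` parameter, and all scores and similarity values are hypothetical.

```python
def predict_unseen(detector_scores, similarity, top_k=3):
    """Score an unseen concept from the scores of known concept detectors.

    detector_scores: known concept -> detector score for one image.
    similarity: known concept -> relatedness to the unseen concept
                (assumed here to come from an ontology or co-occurrence
                statistics; the values are hypothetical inputs).
    Assumption: the combination rule is a similarity-weighted average over
    the top_k most related known concepts.
    """
    # Pair each known concept's similarity weight with its detector score,
    # keeping only concepts with a positive relatedness to the unseen one.
    pairs = [
        (similarity[c], score)
        for c, score in detector_scores.items()
        if similarity.get(c, 0.0) > 0.0
    ]
    if not pairs:
        return 0.0  # no related detector is available

    # Keep the top_k most related concepts.
    pairs.sort(key=lambda p: p[0], reverse=True)
    top = pairs[:top_k]

    # Normalize by the total weight so the result stays in the score range.
    total_weight = sum(w for w, _ in top)
    return sum(w * s for w, s in top) / total_weight


# Hypothetical detector outputs for one image and hypothetical
# relatedness of each known concept to the unseen concept "wolf".
scores = {"dog": 0.9, "cat": 0.7, "car": 0.1}
sim_to_wolf = {"dog": 0.8, "cat": 0.4, "car": 0.0}
wolf_score = predict_unseen(scores, sim_to_wolf, top_k=2)
```

In this sketch, unrelated detectors such as `car` receive zero weight and do not influence the estimate, which is one simple way to let semantic relatedness gate which visual models contribute.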
Index Terms
- Towards a universal detector by mining concepts with small semantic gaps