
"Tell me more": how semantic technologies can help refining internet image search

Published: 15 July 2013
DOI: 10.1145/2501105.2501110

ABSTRACT

Several branches of computer vision rely heavily on the availability of large datasets of labelled images. While such labelling is usually done by hand, the Internet and its related tools can provide powerful assistance. In this paper we address the problem of automatically generating a set of images representing an object class, given only the name of the class. We exploit semantic technologies, such as lexical resources and ontologies, to improve the results obtained from a standard web search engine. We also discuss an application to the automatic construction of a training set for a classification framework. Preliminary experiments on 10 classes from the public Caltech-256 dataset show an average increase in classification accuracy of about 10%.
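To make the general idea concrete, the sketch below shows one way a class name could be expanded into refined image-search queries using a lexical resource such as WordNet. This is an illustrative assumption on our part, not the paper's actual pipeline: the NLTK interface to WordNet, the refine_queries function, and the synonym/hypernym expansion strategy are hypothetical choices standing in for the unspecified method.

    # Hypothetical sketch: uses NLTK's WordNet corpus reader (assumed; run
    # nltk.download('wordnet') once beforehand) to expand a class name into
    # refined image-search queries, in the spirit of the approach described above.
    from nltk.corpus import wordnet as wn

    def refine_queries(class_name, max_senses=3):
        """Expand a class name with WordNet synonyms and hypernym-qualified terms."""
        queries = [class_name]
        for synset in wn.synsets(class_name, pos=wn.NOUN)[:max_senses]:
            # Add synonyms of this sense (e.g. "aeroplane" for "airplane").
            for lemma in synset.lemma_names():
                queries.append(lemma.replace("_", " "))
            # Pair the class name with its hypernym to disambiguate the intended
            # sense, e.g. "bat animal" rather than "bat equipment".
            for hypernym in synset.hypernyms():
                parent = hypernym.lemma_names()[0].replace("_", " ")
                queries.append(class_name + " " + parent)
        # Remove duplicates while preserving order.
        return list(dict.fromkeys(q for q in queries if q))

    if __name__ == "__main__":
        # Example: queries that could be sent to a standard web image search engine.
        print(refine_queries("airplane"))

Each resulting query could then be submitted to a standard web image search engine and the returned images pooled as candidate training examples for the class.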


Published in

VIGTA '13: Proceedings of the International Workshop on Video and Image Ground Truth in Computer Vision Applications
July 2013, 71 pages
ISBN: 9781450321693
DOI: 10.1145/2501105

Copyright © 2013 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States


Qualifiers

• research-article

Acceptance Rates

VIGTA '13 Paper Acceptance Rate: 8 of 15 submissions, 53%
Overall Acceptance Rate: 8 of 15 submissions, 53%
