DOI: 10.1145/1816041.1816052
research-article

The accuracy and value of machine-generated image tags: design and user evaluation of an end-to-end image tagging system

Published: 05 July 2010

ABSTRACT

Automated image tagging is a problem of great interest, due to the proliferation of photo-sharing services. Researchers have achieved considerable advances in understanding motivations and usage of tags, recognizing relevant tags from image content, and leveraging community input to recommend more tags. In this work we address several important issues in building an end-to-end image tagging application, including tagging vocabulary design, taxonomy-based tag refinement, classifier score calibration for effective tag ranking, and selection of valuable tags, rather than just accurate ones. We surveyed users to quantify tag utility and error tolerance, and we use these data both in calibrating scores from automatic classifiers and in taxonomy-based tag expansion. We also compute the relative importance among tags based on user input and statistics from Flickr. We present an end-to-end system evaluated on thousands of user-contributed photos using 60 popular tags. We can issue four tags per image with over 80% accuracy, up from 50% baseline performance, and we confirm through a comparative user study that value-ranked tags are preferable to accuracy-ranked tags.
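
The difference between accuracy-ranked and value-ranked tags can be illustrated with a minimal Python sketch. It is not the system described in the abstract: it assumes that each raw classifier score is calibrated with a Platt-style sigmoid and that a tag's value is its calibrated probability multiplied by a per-tag utility weight. The function names, sigmoid parameters, scores, and utility weights below are all hypothetical and chosen only for illustration.

```python
import math


def platt_probability(score, a, b):
    """Map a raw classifier score to a probability with a Platt-style sigmoid.

    The parameters a and b would normally be fit on held-out data;
    here they are hypothetical constants.
    """
    return 1.0 / (1.0 + math.exp(a * score + b))


def rank_tags(scores, utility, sigmoid_params, k=4, by_value=True):
    """Return the top-k tags, ranked by probability or by probability * utility."""
    ranked = []
    for tag, score in scores.items():
        a, b = sigmoid_params[tag]
        probability = platt_probability(score, a, b)
        # Assumed value model: expected utility of issuing the tag.
        key = probability * utility[tag] if by_value else probability
        ranked.append((key, tag))
    ranked.sort(reverse=True)
    return [tag for _, tag in ranked[:k]]


# Hypothetical raw scores for one photo and hypothetical per-tag utilities.
scores = {"sky": 2.0, "outdoor": 2.5, "beach": 0.8, "dog": 0.5, "sunset": 1.0}
utility = {"sky": 0.3, "outdoor": 0.2, "beach": 0.7, "dog": 1.0, "sunset": 0.8}
sigmoid_params = {tag: (-1.0, 0.0) for tag in scores}

accuracy_ranked = rank_tags(scores, utility, sigmoid_params, by_value=False)
value_ranked = rank_tags(scores, utility, sigmoid_params, by_value=True)
```

With these made-up numbers, the accuracy ranking favors generic but near-certain tags such as "outdoor", while the value ranking promotes less certain but more useful tags such as "dog". This is the kind of trade-off the comparative user study in the abstract examines.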


    Published in

      CIVR '10: Proceedings of the ACM International Conference on Image and Video Retrieval
      July 2010
      492 pages
      ISBN: 9781450301176
      DOI: 10.1145/1816041

      Copyright © 2010 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States
