ABSTRACT
Automated image tagging is a problem of great interest, owing to the proliferation of photo-sharing services. Researchers have made considerable advances in understanding the motivations for and usage of tags, recognizing relevant tags from image content, and leveraging community input to recommend more tags. In this work we address several important issues in building an end-to-end image tagging application, including tagging vocabulary design, taxonomy-based tag refinement, classifier score calibration for effective tag ranking, and the selection of valuable tags rather than merely accurate ones. We surveyed users to quantify tag utility and error tolerance, and we use these data both in calibrating scores from automatic classifiers and in taxonomy-based tag expansion. We also compute the relative importance among tags based on user input and statistics from Flickr. We present an end-to-end system evaluated on thousands of user-contributed photos using 60 popular tags. The system can issue four tags per image with over 80% accuracy, up from a baseline of 50%, and we confirm through a comparative user study that value-ranked tags are preferable to accuracy-ranked tags.
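The distinction between accuracy-ranked and value-ranked tags can be illustrated with a minimal sketch. This is not the system described in the abstract: it assumes that raw classifier scores are mapped to probabilities with a Platt-style sigmoid, and that a tag's value is simply its calibrated probability multiplied by a utility weight. The function names, sigmoid parameters, scores, and utility weights below are all hypothetical and chosen only for illustration.

```python
import math


def platt_probability(score, a, b):
    """Map a raw classifier score to a probability via 1 / (1 + exp(a*score + b)).

    The parameters a and b would normally be fitted on held-out data;
    here they are hypothetical constants.
    """
    return 1.0 / (1.0 + math.exp(a * score + b))


def rank_tags(raw_scores, sigmoid_params, utility, k=4, by_value=True):
    """Return the top-k tags, ranked by calibrated probability or by
    probability times utility (an assumed definition of 'value')."""
    ranked = []
    for tag, score in raw_scores.items():
        a, b = sigmoid_params[tag]
        p = platt_probability(score, a, b)
        key = p * utility[tag] if by_value else p
        ranked.append((key, tag))
    # Sort by descending key; break ties alphabetically for determinism.
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return [tag for _, tag in ranked[:k]]


# Hypothetical inputs: raw scores for one image, per-tag sigmoid
# parameters, and per-tag utility weights (e.g. derived from a survey).
raw_scores = {"sky": 2.0, "dog": 1.0, "outdoor": 2.5, "beach": 0.5}
sigmoid_params = {tag: (-1.0, 0.0) for tag in raw_scores}
utility = {"sky": 0.3, "dog": 0.9, "outdoor": 0.2, "beach": 0.8}

top_by_value = rank_tags(raw_scores, sigmoid_params, utility, k=2)
top_by_accuracy = rank_tags(raw_scores, sigmoid_params, utility, k=2, by_value=False)
print(top_by_value)     # ['dog', 'beach']
print(top_by_accuracy)  # ['outdoor', 'sky']
```

With these made-up numbers, the accuracy ranking favors generic tags that the classifier is confident about, whereas the value ranking promotes less certain but more useful tags.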
Index Terms
- The accuracy and value of machine-generated image tags: design and user evaluation of an end-to-end image tagging system