Data-driven approaches for social image and video tagging

Multimedia Tools and Applications

Abstract

The large success of online social platforms for the creation, sharing and tagging of user-generated media has led to strong interest by the multimedia and computer vision communities in methods and techniques for annotating and searching social media. Visual content similarity, geo-tags and tag co-occurrence, together with social connections and comments, can be exploited to perform tag suggestion as well as content classification and clustering, and to enable more effective semantic indexing and retrieval of visual data. However, there is a need to overcome the relatively low quality of these metadata: user-produced tags and annotations are known to be ambiguous, imprecise and/or incomplete, excessively personalized and limited. At the same time, methods must take into account the ‘web-scale’ quantity of media and the fact that social network users continuously add new images and create new terms. We review the state-of-the-art approaches to automatic annotation and tag refinement for social images, also considering the temporal patterns of their usage, and discuss extensions to tag suggestion and localization in web video sequences.

Notes

  1. Alexa Internet, Inc. http://www.alexa.com

  2. Google Trends. http://www.google.com/trends

  3. Available on request at: http://www.micc.unifi.it/ballan/research/tag-webvideos/

  4. Source code and dataset metadata are available from http://www.micc.unifi.it/uricchio/

Author information

Corresponding author

Correspondence to Marco Bertini.

About this article

Cite this article

Ballan, L., Bertini, M., Uricchio, T. et al. Data-driven approaches for social image and video tagging. Multimed Tools Appl 74, 1443–1468 (2015). https://doi.org/10.1007/s11042-014-1976-4

  • DOI: https://doi.org/10.1007/s11042-014-1976-4
