Abstract
In recent years, Web-based sharing and community services such as Flickr and YouTube have made a vast and rapidly growing amount of multimedia content available online. Uploaded by individual participants, the content in these immense pools is accompanied by varied types of metadata, such as social network data or descriptive textual information. These collections present, at once, new challenges and exciting opportunities for multimedia research. This article presents an approach for “social multimedia” applications. The approach draws on the experience of building a number of successful applications based on mining and multimedia content analysis in a social multimedia context.
Notes
http://www.facebook.com/press/info.php?statistics, retrieved March 2010.
A connected component of this graph does not necessarily mean that all the connected clips actually overlap: the overlap property is not transitive.
Bewilderingly, some of the most examined photos included women in minimal clothing, even when the photos were not necessarily relevant to the location or the tag – proving that human factors are not always as predictable as researchers would hope. Or maybe they are.
It must be said that this author is also “guilty” of administering such evaluations in past research.
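The note above on connected components of the overlap graph can be illustrated with a minimal sketch. The clip names and time intervals below are hypothetical, chosen only to show that overlap is not transitive; they are not data from the article, and the article's own overlap detection may work differently.

```python
from itertools import combinations

# Hypothetical clips as (start, end) times in seconds on a shared timeline.
clips = {"A": (0, 10), "B": (8, 20), "C": (18, 30)}


def overlaps(x, y):
    """True if two closed time intervals share at least one instant."""
    return x[0] <= y[1] and y[0] <= x[1]


# Build the overlap graph: an edge joins two clips whose intervals overlap.
edges = {name: set() for name in clips}
for a, b in combinations(clips, 2):
    if overlaps(clips[a], clips[b]):
        edges[a].add(b)
        edges[b].add(a)


def component(start):
    """Collect every clip reachable from `start` by following overlap edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(edges[node] - seen)
    return seen


group = component("A")
print(sorted(group))                        # all clips in A's component
print(overlaps(clips["A"], clips["C"]))     # whether A and C overlap directly
```

In this sketch A overlaps B and B overlaps C, so all three clips fall in one connected component, yet A and C share no common time span.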
Acknowledgements
The author would like to acknowledge the contributions of his colleagues and interns at Yahoo! Research Berkeley, whose ideas, expertise and excitement made this work possible—or, in fact, made this work [period]. In particular, Lyndon Kennedy made many of the key contributions described here.
Cite this article
Naaman, M. Social multimedia: highlighting opportunities for search and mining of multimedia data in social media applications. Multimed Tools Appl 56, 9–34 (2012). https://doi.org/10.1007/s11042-010-0538-7
DOI: https://doi.org/10.1007/s11042-010-0538-7