DOI: 10.1145/3126858.3126897
research-article

Video Annotation by Cascading Microtasks: a Crowdsourcing Approach

Published: 17 October 2017

Abstract

This paper presents a general approach to crowdsourced video annotation that requires neither trained workers nor experts. It consists of dividing complex annotation tasks into small, simple microtasks and cascading them to generate a final result. This approach allows the use of simple annotation tools rather than complex and expensive annotation systems, and it tends to avoid activities that may be tedious and time-consuming for workers. The microtask cascading strategy is part of a three-step workflow: Preparation, Annotation, and Presentation. To evaluate the proposed approach, a crowdsourcing video annotation process was developed in which four different microtasks were cascaded. In this process, extra content such as images, text, hyperlinks, and other elements is used to enrich the video. A toolkit was developed to support the experiment; it includes Web-based annotation tools and aggregation methods, as well as a presentation system for the annotated videos. The toolkit is open source and can be downloaded and used to replicate this experiment, as well as to construct different crowdsourcing video annotation systems.
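
The cascading idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' toolkit: the abstract does not name the four microtasks or the aggregation methods, so the microtask names, the majority-vote rule, the `min_agreement` threshold, and the functions `aggregate` and `run_cascade` are all assumptions made for illustration. Each microtask collects one answer per worker, aggregates the answers, and passes the agreed result on as the input of the next microtask.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

# Hypothetical types: a worker maps (microtask name, input item) to an answer.
Answer = str
Worker = Callable[[str, str], Answer]


@dataclass
class Microtask:
    name: str                   # hypothetical label, not taken from the paper
    min_agreement: float = 0.5  # assumed consensus threshold


def aggregate(answers: Sequence[Answer], min_agreement: float) -> Optional[Answer]:
    """Majority vote (an assumed aggregation rule).

    Returns the most common answer if its share of the votes is strictly
    greater than min_agreement, otherwise None.
    """
    if not answers:
        return None
    answer, votes = Counter(answers).most_common(1)[0]
    return answer if votes / len(answers) > min_agreement else None


def run_cascade(
    items: Sequence[str],
    microtasks: Sequence[Microtask],
    workers: Sequence[Worker],
) -> List[str]:
    """Run the microtasks in order, feeding each stage's aggregated
    outputs into the next stage."""
    current = list(items)
    for task in microtasks:
        next_items = []
        for item in current:
            answers = [worker(task.name, item) for worker in workers]
            result = aggregate(answers, task.min_agreement)
            if result is not None:  # items without consensus are dropped
                next_items.append(result)
        current = next_items
    return current
```

With two simulated workers that agree and one that returns unrelated output, the agreed answer survives each stage and the outlier is filtered out by the vote. A real deployment would replace the worker callables with contributions collected through Web-based annotation tools.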

Cited By

  • (2022) ARiana: Augmented Reality Based In-Situ Annotation of Assembly Videos. IEEE Access 10, 111704-111724. DOI: 10.1109/ACCESS.2022.3216015. Online publication date: 2022
  • (2018) A Crowdsourcing Tool for Data Augmentation in Visual Question Answering Tasks. Proceedings of the 24th Brazilian Symposium on Multimedia and the Web, 137-140. DOI: 10.1145/3243082.3267455. Online publication date: 16-Oct-2018

Published In

WebMedia '17: Proceedings of the 23rd Brazilian Symposium on Multimedia and the Web
October 2017
522 pages
ISBN:9781450350969
DOI:10.1145/3126858
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

  • SBC: Brazilian Computer Society
  • CNPq: Conselho Nacional de Desenvolvimento Cientifico e Tecn
  • CGIBR: Comite Gestor da Internet no Brazil
  • CAPES: Brazilian Higher Education Funding Council

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. crowdsourcing
  2. human computation
  3. microtasks
  4. multimedia systems
  5. video annotation
  6. video enrichment

Conference

WebMedia '17: Brazilian Symposium on Multimedia and the Web
October 17 - 20, 2017
Gramado, RS, Brazil

Acceptance Rates

WebMedia '17 Paper Acceptance Rate 38 of 138 submissions, 28%;
Overall Acceptance Rate 270 of 873 submissions, 31%
