research-article

Video Annotation by Cascading Microtasks: a Crowdsourcing Approach

Authors:

Marcello N. de Amorim,

Ricardo M.C. Segundo,

Celso A.S. Santos,

Orivaldo de L. Tavares

WebMedia '17: Proceedings of the 23rd Brazilian Symposium on Multimedia and the Web

Pages 49 - 56

https://doi.org/10.1145/3126858.3126897

Published: 17 October 2017

Abstract

This paper presents a general approach to crowdsourced video annotation that requires neither trained workers nor experts. It consists of dividing complex annotation tasks into small, simple microtasks and cascading them to generate a final result. Moreover, this approach allows the use of simple annotation tools rather than complex and expensive annotation systems, and it tends to avoid activities that may be tedious and time-consuming for workers. The cascading microtasks strategy is embedded in a three-step workflow: Preparation, Annotation, and Presentation. To evaluate the proposed approach, a crowdsourcing video annotation process was developed in which four different microtasks were cascaded. In this process, extra content such as images, text, hyperlinks, and other elements is used to enrich the video. A toolkit was developed to support the experiment; it includes Web-based annotation tools and aggregation methods, as well as a presentation system for the annotated videos. The toolkit is open source and can be downloaded and used to replicate this experiment, as well as to construct different crowdsourcing video annotation systems.
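The cascading idea can be illustrated with a minimal Python sketch. It is not the authors' toolkit: the stage names, the `run_cascade` and `majority_vote` functions, and the choice of majority voting as the aggregation method are all assumptions made for illustration, since the abstract does not name the four microtasks or the aggregation methods used. Each stage collects contributions from several workers, aggregates them into one result, and passes the results so far to the next stage.

```python
from collections import Counter
from typing import Callable, Dict, List

Contributions = List[str]
# Hypothetical collector signature: (stage name, segment id, earlier results)
# -> the raw answers submitted by the crowd for that microtask.
Collector = Callable[[str, str, Dict[str, str]], Contributions]


def majority_vote(contributions: Contributions) -> str:
    """Aggregate worker answers by picking the most frequent one.

    Majority voting is an assumed aggregation method, used here only
    because it is simple; ties go to the answer seen first.
    """
    return Counter(contributions).most_common(1)[0][0]


def run_cascade(segment_id: str, stages: List[str], collect: Collector) -> Dict[str, str]:
    """Run the microtasks in order, feeding each stage the earlier results."""
    results: Dict[str, str] = {}
    for stage in stages:
        # Each worker only ever sees one small task plus the agreed context.
        answers = collect(stage, segment_id, dict(results))
        results[stage] = majority_vote(answers)
    return results


def fake_crowd(stage: str, segment_id: str, previous: Dict[str, str]) -> Contributions:
    """Simulated workers standing in for a real annotation tool (hypothetical)."""
    if stage == "spot_point_of_interest":
        return ["yes", "yes", "no"]
    if stage == "suggest_extra_content":
        # This stage relies on the aggregated output of the earlier one.
        assert previous["spot_point_of_interest"] == "yes"
        return ["link", "image", "link"]
    return ["skip"]


annotation = run_cascade(
    "segment-001",
    ["spot_point_of_interest", "suggest_extra_content"],
    fake_crowd,
)
print(annotation)
```

In this sketch no single worker performs the whole annotation: every contribution is a small answer, and the final annotation emerges from the chain of aggregated results.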

Cited By

Pham T, Moesgen T, Siltanen S, Bergstrom J, Xiao Y. (2022). ARiana: Augmented Reality Based In-Situ Annotation of Assembly Videos. IEEE Access 10, 111704-111724. Online publication date: 2022.
https://doi.org/10.1109/ACCESS.2022.3216015
Silva R, Fonseca A, Goldschmidt R, dos Santos J, Bezerra E, Neto M, Novais R, Ferraz C, Viana W. (2018). A Crowdsourcing Tool for Data Augmentation in Visual Question Answering Tasks. Proceedings of the 24th Brazilian Symposium on Multimedia and the Web, 137-140. Online publication date: 16-Oct-2018.
https://dl.acm.org/doi/10.1145/3243082.3267455

Published In

WebMedia '17: Proceedings of the 23rd Brazilian Symposium on Multimedia and the Web

October 2017

522 pages

ISBN:9781450350969

DOI:10.1145/3126858

General Chairs: Valter Roesler (UFRGS, Brazil), José Valdeni de Lima (UFRGS, Brazil)

Program Chairs: Celso Alberto Saibel Santos (UFES, Brazil), Roberto Willrich (UFSC, Brazil)

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SBC: Brazilian Computer Society
CNPq: Conselho Nacional de Desenvolvimento Cientifico e Tecn
CGIBR: Comite Gestor da Internet no Brazil
CAPES: Brazilian Higher Education Funding Council

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

WebMedia '17: Brazilian Symposium on Multimedia and the Web

October 17 - 20, 2017

Gramado, RS, Brazil

Acceptance Rates

WebMedia '17 Paper Acceptance Rate 38 of 138 submissions, 28%;

Overall Acceptance Rate 270 of 873 submissions, 31%

Article Metrics

Total Citations: 2
Total Downloads: 199
Downloads (Last 12 months): 30
Downloads (Last 6 weeks): 0

Reflects downloads up to 17 Jan 2025
