ABSTRACT
Workers in microtask crowdsourcing marketplaces strive to balance the need for monetary income against the need for a high reputation. This balance is often threatened by poorly formulated tasks, which workers attempt to execute despite a sub-optimal understanding of the work to be done.
In this paper we highlight the role of clarity as a characterising property of tasks in crowdsourcing. We surveyed 100 workers on the CrowdFlower platform to verify the presence of task clarity issues in crowdsourcing marketplaces, reveal how crowd workers deal with such issues, and motivate the need for mechanisms that can predict and measure task clarity. Next, we propose a novel model of task clarity based on the goal clarity and role clarity constructs. We sampled 7.1K tasks from the Amazon MTurk marketplace and acquired task clarity labels from crowd workers. We show that task clarity is coherently perceived by crowd workers, and is affected by the type of task. We then propose a set of features to capture task clarity, and use the acquired labels to train and validate a supervised machine learning model for task clarity prediction. Finally, we perform a long-term analysis of the evolution of task clarity on Amazon MTurk, and show that clarity is not a property suitable for temporal characterisation.
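The abstract does not specify the feature set or learner used for clarity prediction; the following is a minimal sketch of such a pipeline, assuming simple readability-style surface features and an off-the-shelf scikit-learn regressor. The feature definitions, the toy task descriptions, and the clarity labels are all illustrative, not the paper's actual data or method.

```python
# Minimal sketch of a supervised task-clarity predictor (illustrative only).
# Assumes readability-style surface features and a random forest regressor;
# the paper's actual features and model may differ.
import re

from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score


def clarity_features(task_description: str) -> list[float]:
    """Hypothetical surface features computed from a task description."""
    words = re.findall(r"[A-Za-z']+", task_description)
    sentences = [s for s in re.split(r"[.!?]+", task_description) if s.strip()]
    n_words = max(len(words), 1)
    n_sents = max(len(sentences), 1)
    return [
        float(n_words),                                # instruction length
        n_words / n_sents,                             # average sentence length
        sum(len(w) for w in words) / n_words,          # average word length
        len({w.lower() for w in words}) / n_words,     # type-token ratio
    ]


# Toy data: task descriptions paired with worker-assigned clarity labels
# (e.g. a mean rating on a 1-5 scale); real labels come from crowd annotation.
tasks = [
    "Transcribe the audio clip below. Write one sentence per line.",
    "Do the thing with the data, fast, like before, but better somehow.",
    "Classify each tweet as positive, negative, or neutral.",
    "Fix stuff in the file.",
]
labels = [4.6, 1.8, 4.2, 2.1]

X = [clarity_features(t) for t in tasks]
model = RandomForestRegressor(n_estimators=100, random_state=0)

# With a realistically sized labelled sample (the paper reports ~7.1K tasks),
# cross-validation estimates how well the features predict perceived clarity.
scores = cross_val_score(model, X, labels, cv=2, scoring="neg_mean_absolute_error")
print("MAE per fold:", [-s for s in scores])
```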