DOI: 10.1145/2047196.2047199
research-article

Instrumenting the crowd: using implicit behavioral measures to predict task performance

Published: 16 October 2011

Abstract

Detecting and correcting low-quality submissions in crowdsourcing tasks is an important challenge. Prior work has primarily focused on worker outcomes or reputation, using approaches such as agreement across workers or with a gold standard to evaluate quality. We propose an alternative and complementary technique that focuses on the way workers work rather than the products they produce. Our technique captures behavioral traces from online crowd workers and uses them to predict outcome measures such as quality, errors, and the likelihood of cheating. We evaluate the effectiveness of the approach across three contexts: classification, generation, and comprehension tasks. The results indicate that we can build predictive models of task performance based on behavioral traces alone, and that these models generalize to related tasks. Finally, we discuss limitations and extensions of the approach.
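The approach described above amounts to a simple pipeline: log low-level interaction events while a worker completes a task, summarize each trace as a feature vector, and fit a classifier against gold-standard outcome labels. The sketch below illustrates that pipeline only; the event schema, the four features (time on task, event rate, long pauses, window blurs), and the choice of scikit-learn's LogisticRegression are assumptions made here for concreteness, not the authors' implementation.

# A minimal sketch of trace-based quality prediction, assuming timestamped
# browser events per worker and scikit-learn for the classifier.
from dataclasses import dataclass
from typing import List, Sequence

from sklearn.linear_model import LogisticRegression

@dataclass
class TraceEvent:
    t: float    # seconds since the task page loaded
    kind: str   # e.g. "mousemove", "keydown", "scroll", "focus", "blur"

def trace_features(events: Sequence[TraceEvent]) -> List[float]:
    """Summarize one worker's behavioral trace as a fixed-length vector.

    The features here (time on task, event rate, long pauses, window
    blurs) are illustrative stand-ins for the kinds of implicit
    measures the paper describes, not the authors' feature set.
    """
    if len(events) < 2:
        return [0.0, 0.0, 0.0, 0.0]
    duration = events[-1].t - events[0].t
    gaps = [b.t - a.t for a, b in zip(events, events[1:])]
    long_pauses = sum(1 for g in gaps if g > 5.0)        # idle > 5 s
    blurs = sum(1 for e in events if e.kind == "blur")   # left the tab
    rate = len(events) / duration if duration > 0 else 0.0
    return [duration, rate, float(long_pauses), float(blurs)]

def fit_quality_model(traces, labels):
    """Fit a classifier mapping traces to gold-standard quality labels."""
    X = [trace_features(t) for t in traces]
    return LogisticRegression().fit(X, labels)

# Usage: score an unseen worker's submission from behavior alone.
# model = fit_quality_model(train_traces, train_labels)
# model.predict([trace_features(new_trace)])

Because such a model sees only behavior, it can flag likely low-quality or cheating submissions before any output is graded, complementing agreement-based and gold-standard checks rather than replacing them.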




    Published In

    UIST '11: Proceedings of the 24th annual ACM symposium on User interface software and technology
    October 2011
    654 pages
ISBN: 9781450307161
DOI: 10.1145/2047196

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 16 October 2011


    Author Tags

    1. crowdsourcing
    2. event logging
    3. mechanical turk
    4. performance
    5. user behavior
    6. user logging

    Qualifiers

    • Research-article

    Conference

    UIST '11

    Acceptance Rates

UIST '11 Paper Acceptance Rate: 67 of 262 submissions, 26%
Overall Acceptance Rate: 561 of 2,567 submissions, 22%



    Cited By

• (2024) Snapper: Accelerating Bounding Box Annotation in Object Detection Tasks with Find-and-Snap Tooling. Proceedings of the 29th International Conference on Intelligent User Interfaces, 471-488. DOI: 10.1145/3640543.3645162. Online publication date: 18-Mar-2024.
• (2024) "I Prefer Regular Visitors to Answer My Questions": Users' Desired Experiential Background of Contributors for Location-based Crowdsourcing Platform. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-18. DOI: 10.1145/3613904.3642520. Online publication date: 11-May-2024.
• (2024) LabelAId: Just-in-time AI Interventions for Improving Human Labeling Quality and Domain Knowledge in Crowdsourcing Systems. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-21. DOI: 10.1145/3613904.3642089. Online publication date: 11-May-2024.
• (2024) Probing into the Usage of Task Fingerprinting in Web Games to Enhance Cognitive Personalization: A Pilot Gamified Experience with Neurodivergent Participants. 2024 IEEE 12th International Conference on Serious Games and Applications for Health (SeGAH), 1-8. DOI: 10.1109/SeGAH61285.2024.10639597. Online publication date: 7-Aug-2024.
• (2023) A Model for Cognitive Personalization of Microtask Design. Sensors 23(7), 3571. DOI: 10.3390/s23073571. Online publication date: 29-Mar-2023.
• (2023) Designing for Hybrid Intelligence: A Taxonomy and Survey of Crowd-Machine Interaction. Applied Sciences 13(4), 2198. DOI: 10.3390/app13042198. Online publication date: 8-Feb-2023.
• (2023) CiteSee: Augmenting Citations in Scientific Papers with Persistent and Personalized Historical Context. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1-15. DOI: 10.1145/3544548.3580847. Online publication date: 19-Apr-2023.
• (2023) Neglected Free Lunch – Learning Image Classifiers Using Annotation Byproducts. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 20143-20155. DOI: 10.1109/ICCV51070.2023.01848. Online publication date: 1-Oct-2023.
• (2023) Remote Work, Work Measurement and the State of Work Research in Human-Centred Computing. Interacting with Computers 35(5), 725-734. DOI: 10.1093/iwc/iwad014. Online publication date: 27-Feb-2023.
• (2022) HumanAL. Proceedings of the Workshop on Human-In-the-Loop Data Analytics, 1-8. DOI: 10.1145/3546930.3547496. Online publication date: 12-Jun-2022.
