DOI: 10.1145/2348283.2348400
Research article

Quality through flow and immersion: gamifying crowdsourced relevance assessments

Published: 12 August 2012

Abstract

Crowdsourcing is a market of steadily-growing importance upon which both academia and industry increasingly rely. However, this market appears to be inherently infested with a significant share of malicious workers who try to maximise their profits through cheating or sloppiness. This serves to undermine the very merits crowdsourcing has come to represent. Based on previous experience as well as psychological insights, we propose the use of a game in order to attract and retain a larger share of reliable workers to frequently-requested crowdsourcing tasks such as relevance assessments and clustering. In a large-scale comparative study conducted using recent TREC data, we investigate the performance of traditional HIT designs and a game-based alternative that achieves high quality at significantly lower pay rates while attracting fewer malicious submissions.



Published In

SIGIR '12: Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
August 2012
1236 pages
ISBN:9781450314725
DOI:10.1145/2348283


Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 August 2012


Author Tags

  1. clustering
  2. crowdsourcing
  3. gamification
  4. relevance assessments
  5. serious games

Qualifiers

  • Research-article

Conference

SIGIR '12

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

