Research Article
DOI: 10.1145/3159652.3159661

Demographics and Dynamics of Mechanical Turk Workers

Published: 2 February 2018

ABSTRACT

We present an analysis of the population dynamics and demographics of Amazon Mechanical Turk workers, based on the results of a survey that we conducted over a period of 28 months, with more than 85K responses from 40K unique participants. The demographics survey is ongoing (as of November 2017), and the results are available at http://demographics.mturk-tracker.com; we provide an API for researchers to download the survey data. We use techniques from the field of ecology, in particular the capture-recapture technique, to understand the size and dynamics of the underlying population. We also demonstrate how to model and account for the inherent selection biases in such surveys. Our results indicate that more than 100K workers are available on Amazon's crowdsourcing platform, that worker participation follows a heavy-tailed distribution, and that at any given time there are more than 2K active workers. We also show that the half-life of a worker on the platform is around 12-18 months, and that the rate of arrival of new workers balances the rate of departures, keeping the overall worker population relatively stable. Finally, we demonstrate how to estimate the propensity of different demographic groups to participate in the survey tasks, and how to correct for such biases. Our methodology is generic and can be applied to any platform where we are interested in understanding the dynamics and demographics of the underlying user population.
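The population-size estimates above rest on capture-recapture methods borrowed from ecology. As a rough illustration of the idea only, here is a minimal Python sketch of the classic two-occasion Lincoln-Petersen estimator with Chapman's bias correction; the counts are invented for illustration and are not the paper's data (the paper uses richer open-population models).

    # Two-occasion capture-recapture (Lincoln-Petersen with Chapman's
    # bias correction). Treat each survey batch as a "capture occasion"
    # and returning workers as "recaptures". Counts below are invented.

    def chapman_estimate(n1: int, n2: int, m: int) -> float:
        """Estimate total population size N.

        n1: workers observed in the first occasion
        n2: workers observed in the second occasion
        m:  workers observed in both occasions (recaptures)
        """
        return (n1 + 1) * (n2 + 1) / (m + 1) - 1

    # Example: 2,000 workers in batch one, 2,200 in batch two, 40 overlap.
    print(round(chapman_estimate(2000, 2200, 40)))  # ~107,000 workers

The intuition: the smaller the overlap between two independent samples, the larger the underlying population must be, which is how a few thousand survey responses can reveal a platform of over 100K workers.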
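The abstract also mentions modeling and correcting participation bias across demographic groups. As a hedged sketch of the general reweighting idea (not necessarily the paper's exact estimator), the snippet below corrects observed demographic shares by each group's relative propensity to take the survey task; all groups, counts, and propensities are hypothetical.

    # Inverse-propensity reweighting: one simple way to correct for the
    # kind of participation bias the paper models. All numbers are
    # hypothetical; "propensity" is each group's relative rate of
    # choosing to take the survey task.

    observed = {"18-24": 300, "25-34": 500, "35-54": 150, "55+": 50}
    propensity = {"18-24": 1.5, "25-34": 1.2, "35-54": 0.8, "55+": 0.5}

    # Down-weight over-eager groups, up-weight reluctant ones.
    weighted = {g: n / propensity[g] for g, n in observed.items()}
    total = sum(weighted.values())

    for group, w in weighted.items():
        print(f"{group}: raw {observed[group] / 1000:.1%} -> corrected {w / total:.1%}")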


Published in

WSDM '18: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining
February 2018
821 pages
ISBN: 9781450355810
DOI: 10.1145/3159652

      Copyright © 2018 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States



Acceptance Rates

WSDM '18 paper acceptance rate: 81 of 514 submissions, 16%. Overall acceptance rate: 498 of 2,863 submissions, 17%.
