Abstract
Crowdsourcing is a time- and cost-efficient web-based technique for labeling large datasets such as those used in machine learning. Controlling output quality in crowdsourcing is an active research domain that has yielded a fair number of methods and approaches. However, because of the quantitative and qualitative limitations of existing evaluation datasets, comparing and evaluating these methods has remained difficult. In this paper, we present CrowdED (Crowdsourcing Evaluation Dataset), a rich dataset for evaluating a wide range of quality control methods, together with CREX (CReate Enrich eXtend), a framework that facilitates the creation of such datasets and guarantees their future-proofing and reusability through customizable extension and enrichment.
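To make the evaluation setting concrete, here is a minimal sketch of how a CrowdED-style dataset (redundant worker answers plus ground-truth labels) can be used to score an answer-aggregation quality control method such as majority voting. The data layout, field names, and the majority_vote and accuracy helpers are illustrative assumptions, not CrowdED's actual schema or CREX's API.

```python
from collections import Counter

# Toy stand-in for a CrowdED-style dataset: each task carries a gold
# (ground-truth) label and redundant answers from several workers.
# The structure and field names are hypothetical, for illustration only.
dataset = [
    {"task": "t1", "gold": "cat", "answers": {"w1": "cat", "w2": "cat", "w3": "dog"}},
    {"task": "t2", "gold": "dog", "answers": {"w1": "dog", "w2": "cat", "w3": "dog"}},
    {"task": "t3", "gold": "cat", "answers": {"w1": "dog", "w2": "dog", "w3": "cat"}},
]

def majority_vote(answers):
    """Aggregate redundant worker answers by simple plurality."""
    return Counter(answers.values()).most_common(1)[0][0]

def accuracy(tasks, aggregate):
    """Share of tasks where the aggregated label matches the gold label."""
    hits = sum(aggregate(t["answers"]) == t["gold"] for t in tasks)
    return hits / len(tasks)

print(f"majority-vote accuracy: {accuracy(dataset, majority_vote):.2f}")  # 0.67
```

Any aggregation method exposing the same interface (e.g., a Dawid-Skene-style estimator of per-worker error rates) plugs into the same accuracy harness; a shared dataset with gold labels is what makes such side-by-side comparisons possible.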
Notes
1. E.g., demographics and self-evaluation profiles.
2.
3. https://www.figure-eight.com. Formerly named CrowdFlower.
4. Yet, it is not the only one, since any other task corpus can be used.
5. FE levels range from 1 to 3, where level 3 represents the most experienced and reliable workers and level 1 includes all qualified workers.
6. A demo of CREX's user interface and a real-world use scenario can be found at https://project-crowd.eu/.
7. E.g., requester-accessible back-end services or an API to dynamically modify tasks and assignments.
© 2019 Springer Nature Switzerland AG
Cite this paper
Awwad, T., Bennani, N., Rehn-Sonigo, V., Brunie, L., Kosch, H. (2019). CrowdED and CREX: Towards Easy Crowdsourcing Quality Control Evaluation. In: Welzer, T., Eder, J., Podgorelec, V., Kamišalić Latifić, A. (eds.) Advances in Databases and Information Systems. ADBIS 2019. Lecture Notes in Computer Science, vol. 11695. Springer, Cham. https://doi.org/10.1007/978-3-030-28730-6_18
DOI: https://doi.org/10.1007/978-3-030-28730-6_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-28729-0
Online ISBN: 978-3-030-28730-6
eBook Packages: Computer Science, Computer Science (R0)