
CrowdED and CREX: Towards Easy Crowdsourcing Quality Control Evaluation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11695)

Abstract

Crowdsourcing is a time- and cost-efficient web-based technique for labeling large datasets such as those used in Machine Learning. Controlling the output quality in crowdsourcing is an active research domain that has yielded a fair number of methods and approaches. However, due to the quantitative and qualitative limitations of existing evaluation datasets, comparing and evaluating these methods has been very limited. In this paper, we present CrowdED (Crowdsourcing Evaluation Dataset), a rich dataset for evaluating a wide range of quality control methods, alongside CREX (CReate Enrich eXtend), a framework that facilitates the creation of such datasets and guarantees their future-proofing and reusability through customizable extension and enrichment.
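To make the evaluation setting concrete, the following is a minimal, purely illustrative sketch of how a baseline quality control method (simple majority voting) could be scored against gold answers on a small set of crowdsourced judgments. The data layout and field names (task_id, worker_id, answer) are assumptions made for this sketch and are not taken from the actual CrowdED schema.

    # Illustrative sketch only: scoring a majority-vote baseline against gold labels.
    # The field names below (task_id, worker_id, answer) are assumptions for this
    # example and do not reflect the actual CrowdED data schema.
    from collections import Counter, defaultdict

    # Toy answer matrix: each row is one worker's answer to one task.
    judgments = [
        {"task_id": "t1", "worker_id": "w1", "answer": "cat"},
        {"task_id": "t1", "worker_id": "w2", "answer": "cat"},
        {"task_id": "t1", "worker_id": "w3", "answer": "dog"},
        {"task_id": "t2", "worker_id": "w1", "answer": "dog"},
        {"task_id": "t2", "worker_id": "w3", "answer": "dog"},
    ]
    gold = {"t1": "cat", "t2": "dog"}  # ground-truth labels used only for evaluation

    def majority_vote(rows):
        """Aggregate one label per task by taking the most frequent worker answer."""
        answers_by_task = defaultdict(list)
        for row in rows:
            answers_by_task[row["task_id"]].append(row["answer"])
        return {task: Counter(answers).most_common(1)[0][0]
                for task, answers in answers_by_task.items()}

    def accuracy(predicted, truth):
        """Fraction of tasks whose aggregated label matches the gold label."""
        hits = sum(1 for task, label in predicted.items() if truth.get(task) == label)
        return hits / len(truth)

    print(accuracy(majority_vote(judgments), gold))  # 1.0 on this toy data

More sophisticated quality control methods (e.g., worker-expertise weighting or EM-based answer aggregation) would be evaluated in the same way, by replacing the majority_vote baseline with the method under test while keeping the gold answers fixed.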


Notes

  1. E.g., demographics and self-evaluation profiles.

  2. https://www.figure-eight.com/data-for-everyone/.

  3. https://www.figure-eight.com. Formerly named CrowdFlower.

  4. Yet, it is not the only one, since any other task corpus can be used.

  5. FE levels range from 1 to 3, where level 3 represents the most experienced and reliable workers and level 1 represents all qualified workers.

  6. A demo of CREX’s user interface and a real-world use scenario can be found at https://project-crowd.eu/.

  7. E.g., requester-accessible back-end services or an API to dynamically modify tasks and assignments.


Author information


Correspondence to Tarek Awwad.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Awwad, T., Bennani, N., Rehn-Sonigo, V., Brunie, L., Kosch, H. (2019). CrowdED and CREX: Towards Easy Crowdsourcing Quality Control Evaluation. In: Welzer, T., Eder, J., Podgorelec, V., Kamišalić Latifić, A. (eds) Advances in Databases and Information Systems. ADBIS 2019. Lecture Notes in Computer Science, vol 11695. Springer, Cham. https://doi.org/10.1007/978-3-030-28730-6_18

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-28730-6_18


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-28729-0

  • Online ISBN: 978-3-030-28730-6

  • eBook Packages: Computer Science, Computer Science (R0)
