Abstract
Crowdsourcing has proven to be a useful tool for solving tasks that are hard for computers. Because workers' quality varies widely, it is crucial to model their reliability in order to compute effective task-assignment plans and produce accurate estimates of the truths. However, existing reliability models either cannot accurately estimate workers' fine-grained reliability or require external information such as text descriptions. In this paper, we divide tasks into clusters (i.e., topics) based on workers' behaviors, and propose a Bayesian latent topic model that describes the topic distributions and workers' topic-level expertise. We further present an online task-assignment scheme that incorporates the latent topic model to dynamically assign each incoming worker the set of tasks with the Maximum Expected Gain (MEG). Experimental results demonstrate that our method significantly decreases the number of task assignments and achieves higher accuracy than state-of-the-art approaches.
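The assignment idea summarized above can be illustrated with a toy expected-gain computation. This is a minimal sketch under simplifying assumptions, not the paper's exact formulation: tasks are binary, each task belongs to one topic, a worker's per-topic reliability is a single accuracy value, and "gain" is defined here as the expected reduction in entropy of the task's truth estimate after one answer. All function names and the gain definition are illustrative.

```python
import math

def entropy(p):
    """Shannon entropy (in bits) of a Bernoulli truth estimate."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def expected_gain(p_true, accuracy):
    """Expected entropy reduction if a worker with the given topical
    accuracy answers a binary task whose current probability of being
    'true' is p_true (illustrative gain definition, not the paper's)."""
    # Probability the worker answers 'true'
    p_yes = p_true * accuracy + (1 - p_true) * (1 - accuracy)
    p_no = 1 - p_yes
    # Posterior truth estimates after each possible answer (Bayes' rule)
    post_yes = p_true * accuracy / p_yes if p_yes > 0 else 0.0
    post_no = p_true * (1 - accuracy) / p_no if p_no > 0 else 0.0
    expected_post_entropy = p_yes * entropy(post_yes) + p_no * entropy(post_no)
    return entropy(p_true) - expected_post_entropy

def assign_tasks(tasks, worker_accuracy, k):
    """Pick the k tasks with maximum expected gain for one worker.
    `tasks` maps task id -> (topic, current P(truth = true));
    `worker_accuracy` maps topic -> the worker's estimated accuracy."""
    scored = [
        (expected_gain(p, worker_accuracy.get(topic, 0.5)), tid)
        for tid, (topic, p) in tasks.items()
    ]
    scored.sort(reverse=True)
    return [tid for _, tid in scored[:k]]

# Toy example: the worker is strong on "sports", near-random on "movies".
tasks = {"t1": ("sports", 0.5), "t2": ("sports", 0.95), "t3": ("movies", 0.5)}
worker = {"sports": 0.9, "movies": 0.55}
print(assign_tasks(tasks, worker, 2))  # → ['t1', 't2']
```

The uncertain sports task (`t1`) scores highest because the worker's high topical accuracy resolves the most uncertainty there; the near-settled task (`t2`) and the off-topic task (`t3`) yield little gain. The paper's online scheme additionally updates the topic model and reliability estimates as answers arrive.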
Acknowledgements
The research of the authors is partially supported by the National Natural Science Foundation of China (NSFC) under Grants No. 61672369, No. 61572342, and No. 61873177, and by the Natural Science Foundation of Jiangsu Province under Grant No. BK20161258. The research of Hongli Xu is supported by the NSFC under Grants No. 61472383, U1709217, and 61728207.
Copyright information
© 2018 Springer Nature Switzerland AG
Cite this paper
Du, Y., Sun, YE., Huang, H., Huang, L., Xu, H., Wu, X. (2018). Quality-Aware Online Task Assignment Using Latent Topic Model. In: Tang, S., Du, DZ., Woodruff, D., Butenko, S. (eds) Algorithmic Aspects in Information and Management. AAIM 2018. Lecture Notes in Computer Science(), vol 11343. Springer, Cham. https://doi.org/10.1007/978-3-030-04618-7_11
Print ISBN: 978-3-030-04617-0
Online ISBN: 978-3-030-04618-7