
Quality-Aware Online Task Assignment Using Latent Topic Model

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11343)

Abstract

Crowdsourcing has proven to be a useful tool for solving tasks that are hard for computers. Because workers' quality is uneven, it is crucial to model their reliability in order to compute effective task assignment plans and produce accurate estimates of the truths. However, existing reliability models either cannot accurately estimate workers' fine-grained reliability or require external information such as text descriptions. In this paper, we divide tasks into clusters (i.e., topics) based on workers' behaviors, and propose a Bayesian latent topic model that describes the topic distributions and workers' topic-level expertise. We further present an online task assignment scheme that incorporates the latent topic model to dynamically assign each incoming worker the set of tasks with the Maximum Expected Gain (MEG). Experimental results demonstrate that our method significantly decreases the number of task assignments and achieves higher accuracy than state-of-the-art approaches.
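The abstract's core idea, assigning an incoming worker the tasks with Maximum Expected Gain given their topic-level expertise, can be illustrated with a minimal sketch. This is not the authors' algorithm or gain function: the `expected_gain` surrogate, the data structures, and all names here are illustrative assumptions, showing only the general shape of topic-aware MEG assignment.

```python
def expected_gain(confidence, topic_accuracy):
    """Illustrative surrogate for the expected increase in confidence of a
    task's estimated truth if a worker with the given topic-level accuracy
    answers it: a more accurate worker on a less certain task gains more."""
    return topic_accuracy * (1.0 - confidence)

def assign_tasks(tasks, worker_topic_accuracy, k):
    """Assign the incoming worker the k tasks with maximum expected gain.

    tasks: list of (task_id, topic, current_confidence) tuples
    worker_topic_accuracy: dict mapping topic -> estimated topic-level accuracy
    """
    scored = [
        (expected_gain(conf, worker_topic_accuracy.get(topic, 0.5)), tid)
        for tid, topic, conf in tasks
    ]
    scored.sort(reverse=True)  # highest expected gain first
    return [tid for _, tid in scored[:k]]

# Hypothetical example: three tasks on two topics, one incoming worker.
tasks = [("t1", "sports", 0.9), ("t2", "politics", 0.6), ("t3", "sports", 0.5)]
accuracy = {"sports": 0.8, "politics": 0.7}
print(assign_tasks(tasks, accuracy, 2))  # → ['t3', 't2']
```

The least certain task on the worker's strongest topic (`t3`) is assigned first, which matches the intuition that topic-aware assignment spends worker effort where it reduces uncertainty most.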


Notes

  1.

    https://sites.google.com/site/crowdscale2013/shared-task/task-fact-eval.


Acknowledgements

The research of the authors is partially supported by the National Natural Science Foundation of China (NSFC) under Grants No. 61672369, No. 61572342, and No. 61873177, and by the Natural Science Foundation of Jiangsu Province under Grant No. BK20161258. The research of Hongli Xu is supported by the NSFC under Grants No. 61472383, U1709217, and 61728207.

Author information


Corresponding authors

Correspondence to Yu-E Sun or He Huang.


Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Du, Y., Sun, YE., Huang, H., Huang, L., Xu, H., Wu, X. (2018). Quality-Aware Online Task Assignment Using Latent Topic Model. In: Tang, S., Du, DZ., Woodruff, D., Butenko, S. (eds) Algorithmic Aspects in Information and Management. AAIM 2018. Lecture Notes in Computer Science(), vol 11343. Springer, Cham. https://doi.org/10.1007/978-3-030-04618-7_11

  • DOI: https://doi.org/10.1007/978-3-030-04618-7_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04617-0

  • Online ISBN: 978-3-030-04618-7

  • eBook Packages: Computer Science; Computer Science (R0)
