Uncovering the Latent Structures of Crowd Labeling

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9077)

Abstract

Crowdsourcing provides a new way to distribute enormous tasks to a crowd of annotators. The divergent knowledge backgrounds and personal preferences of crowd annotators lead to noisy (or even inconsistent) answers to the same question. However, diverse labels also provide information about the underlying structures of tasks and annotators. This paper proposes latent-class assumptions for learning-from-crowds models: items can be separated into several latent classes, and workers' annotating behaviors may differ among classes. We propose a nonparametric model to uncover the latent classes, and also extend the state-of-the-art minimax entropy estimator to learn latent structures. Experimental results on both synthetic data and real data collected from Amazon Mechanical Turk demonstrate that our methods can disclose interesting and meaningful latent structures, and that incorporating latent class structures can also bring significant improvements in ground-truth label recovery for difficult tasks.
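
The latent-class assumption can be illustrated with a small synthetic generator. The sketch below is not the paper's nonparametric model or its minimax entropy estimator; it only simulates the setting the abstract describes. The function names, the uniform prior over latent classes, the accuracy range, and the rule that a wrong answer is drawn uniformly from the other labels are all assumptions made for illustration.

```python
import random
from collections import Counter


def simulate_crowd_labels(num_items, num_workers, num_classes, num_labels, seed=0):
    """Simulate crowd labels where worker accuracy depends on an item's latent class.

    Illustrative assumptions (not taken from the paper): latent classes and true
    labels are drawn uniformly, and a wrong answer is uniform over other labels.
    Requires num_labels >= 2.
    """
    rng = random.Random(seed)
    true_labels = [rng.randrange(num_labels) for _ in range(num_items)]
    latent_classes = [rng.randrange(num_classes) for _ in range(num_items)]
    # accuracy[w][k]: chance that worker w labels an item of latent class k correctly.
    accuracy = [[rng.uniform(0.3, 0.95) for _ in range(num_classes)]
                for _ in range(num_workers)]
    labels = []
    for i in range(num_items):
        row = []
        for w in range(num_workers):
            if rng.random() < accuracy[w][latent_classes[i]]:
                row.append(true_labels[i])
            else:
                wrong = [c for c in range(num_labels) if c != true_labels[i]]
                row.append(rng.choice(wrong))
        labels.append(row)
    return true_labels, latent_classes, accuracy, labels


def majority_vote(labels):
    """Baseline aggregation: the most frequent label per item."""
    return [Counter(row).most_common(1)[0][0] for row in labels]


if __name__ == "__main__":
    truth, classes, acc, labels = simulate_crowd_labels(200, 7, 3, 2, seed=1)
    guess = majority_vote(labels)
    # Majority-vote accuracy per latent class; classes where most workers are
    # unreliable are the "difficult" ones.
    for k in range(3):
        idx = [i for i in range(len(truth)) if classes[i] == k]
        if idx:
            hit = sum(guess[i] == truth[i] for i in idx)
            print(f"latent class {k}: {hit / len(idx):.2f} over {len(idx)} items")
```

Breaking the majority-vote accuracy down by latent class shows why a single per-worker reliability score can be too coarse: the same worker may be dependable on one class of items and close to random on another.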

Corresponding author

Correspondence to Jun Zhu.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Tian, T., Zhu, J. (2015). Uncovering the Latent Structures of Crowd Labeling. In: Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D., Motoda, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2015. Lecture Notes in Computer Science, vol 9077. Springer, Cham. https://doi.org/10.1007/978-3-319-18038-0_31

  • DOI: https://doi.org/10.1007/978-3-319-18038-0_31

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18037-3

  • Online ISBN: 978-3-319-18038-0

  • eBook Packages: Computer Science, Computer Science (R0)
