Abstract
This paper presents a probabilistic model for handling multiple annotators in binary classification tasks. Annotators may decline to label instances about which they are unsure, so ignorance and actual errors are represented separately. Our model integrates both error and ignorance into a conditional Bayesian model in which only the observed instance is needed to infer the label. Furthermore, it characterizes the properties of each annotator more accurately than previous methods. Extensive experiments on a broad range of data sets validate the effectiveness of learning from multiple naive (ignorant) annotators.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wolley, C., Quafafou, M. (2012). Learning from Multiple Naive Annotators. In: Zhou, S., Zhang, S., Karypis, G. (eds) Advanced Data Mining and Applications. ADMA 2012. Lecture Notes in Computer Science, vol 7713. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35527-1_15
DOI: https://doi.org/10.1007/978-3-642-35527-1_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35526-4
Online ISBN: 978-3-642-35527-1
eBook Packages: Computer Science (R0)