Abstract
Human-annotated data is the cornerstone of today's artificial intelligence efforts, yet data labeling can be complicated and expensive, especially when human labelers disagree with one another. Current practice typically overrules such disagreement with majority-voted labels. However, in subjective labeling tasks such as hate speech annotation, disagreement among individual labelers can be difficult to resolve. In this paper, we explored why such disagreements occur using a mixed-methods approach, including interviews with experts, concept mapping exercises, and self-report items, to develop a multidimensional scale that distills how annotators label a hate speech corpus. We tested this scale with 170 annotators in a hate speech annotation task. Results showed that the scale can reveal facets of individual differences among annotators (e.g., age and personality) and how those facets relate to an annotator's final label decision for an instance. This work contributes to the understanding of how humans annotate data, and the proposed scale can potentially improve the value of currently discarded minority-vote labels.
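To make the facet-to-label relationship concrete, the following is a minimal, hypothetical sketch in Python. It is not the authors' instrument or analysis pipeline: the facet names, simulated scores, and model choice are illustrative assumptions. It simulates facet scores for 170 annotators and fits a logistic regression of a binary label decision on those scores, which is one standard way to test whether annotator characteristics are associated with the label an annotator ultimately assigns.

```python
# Hypothetical illustration only, not the authors' instrument or analysis:
# simulate facet scores for 170 annotators and ask whether they predict a
# binary label decision ("hate speech" vs. not) via logistic regression.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n_annotators = 170  # matches the sample size reported in the abstract

# Hypothetical individual-difference facets (names are illustrative only).
age = rng.integers(18, 66, size=n_annotators)
empathy = rng.normal(3.5, 0.7, size=n_annotators)  # e.g., a 1-5 self-report score

# Simulated binary label decision for a single corpus instance.
true_logit = -1.0 + 0.03 * (age - age.mean()) + 0.8 * (empathy - 3.5)
label = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_logit)))

# Regress the label decision on the facet scores; the fitted coefficients and
# p-values indicate whether each facet is associated with the final label.
X = sm.add_constant(np.column_stack([age, empathy]))
result = sm.Logit(label, X).fit(disp=0)
print(result.summary(xname=["const", "age", "empathy"]))
```

With real annotation data, the simulated facets would be replaced by the scale's measured facet scores and the simulated outcome by each annotator's actual label decision.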
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Sang, Y., Stanton, J. (2022). The Origin and Value of Disagreement Among Data Labelers: A Case Study of Individual Differences in Hate Speech Annotation. In: Smits, M. (ed.) Information for a Better World: Shaping the Global Future. iConference 2022. Lecture Notes in Computer Science, vol. 13192. Springer, Cham. https://doi.org/10.1007/978-3-030-96957-8_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-96956-1
Online ISBN: 978-3-030-96957-8
eBook Packages: Computer Science (R0)