What Is Abusive Language?

Integrating Different Views on Abusive Language for Machine Learning

  • Conference paper
Disinformation in Open Online Media (MISDOOM 2019)

Abstract

Abusive language has been corrupting online conversations since the inception of the internet. Substantial research effort has gone into investigating the problem and resolving it algorithmically. Different aspects such as “cyberbullying”, “hate speech”, or “profanity” have been studied extensively, but often with inconsistent vocabulary such as “offensive language” or “harassment”. This has led to a state of confusion within the research community. The inconsistency can be considered an inhibitor for the domain: it increases the risk of unintentionally redundant work and leads to undifferentiated machine learning classifiers that are hard to use and hard to justify. To remedy this, this paper introduces a novel, configurable, multi-view approach to defining abusive language concepts.

Notes

  1. With the term “Socially Unacceptable Discourse”, [36] introduced another umbrella term, which, however, has so far not seen an uptake similar to that of “Abusive Language”.

  2. The third iteration in the year 2019 is already scheduled [7].

  3. For example, Facebook is still opening ever more moderation centers [61], and a growing number of reports describe how content moderation is becoming ever more unmanageable [40].

  4. Given the context of the paper, “language” is assumed to refer to written online comments.

  5. The theoretical need to obtain a fully grounded definition of “cruel”, as postulated by the symbol grounding problem of [44, 45], is acknowledged. However, a full grounding is beyond the scope of this study and is hence left for future work by a more apt linguist.

  6. We subsume anti-Semitic, anti-Muslim, and other religion-targeted utterances at this point.

References

  1. Abel, A., Meyer, C.M.: The dynamics outside the paper: user contributions to online dictionaries. In: Proceedings of the 3rd eLex Conference ‘Electronic Lexicography in the 21st Century: Thinking Outside the Paper’, pp. 179–194. eLex, Tallinn (2013)

  2. Ackerman, M.S.: The intellectual challenge of CSCW: the gap between social requirements and technical feasibility. Hum. Comput. Interact. 15(2–3), 179–203 (2000)

  3. Al Sohibani, M., Al Osaimi, N., Al Ehaidib, R., Al Muhanna, S., Dahanayake, A.: Factors that influence the quality of crowdsourcing. In: New Trends Database Information Systems II: Selected Papers 18th East European Conference on Advances in Databases and Information Systems and Associated Satellite Events, ADBIS 2014, Ohrid, Macedonia, pp. 287–300 (2015)

  4. Anzovino, M., Fersini, E., Rosso, P.: Automatic identification and classification of misogynistic language on Twitter. In: Silberztein, M., Atigui, F., Kornyshova, E., Métais, E., Meziane, F. (eds.) NLDB 2018. LNCS, vol. 10859, pp. 57–64. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-91947-8_6

  5. Association of Computational Linguistics: ALW1: 1st Workshop on Abusive Language Online (2017). https://sites.google.com/site/abusivelanguageworkshop2017/home

  6. Association of Computational Linguistics: ALW2: 2nd Workshop on Abusive Language Online (2018). https://sites.google.com/view/alw2018

  7. Association of Computational Linguistics: ALW3: 3rd Workshop on Abusive Language Online (2019). https://sites.google.com/view/alw3/home

  8. Badjatiya, P., Gupta, S., Gupta, M., Varma, V.: Deep learning for hate speech detection in tweets. In: Proceedings 26th International Conference World Wide Web Companion, WWW 2017 Companion, pp. 759–760. International World Wide Web Conferences Steering Committee, Perth, Australia (2017)

  9. Bourgonje, P., Moreno-Schneider, J., Srivastava, A., Rehm, G.: Automatic classification of abusive language and personal attacks in various forms of online communication. In: Rehm, G., Declerck, T. (eds.) GSCL 2017. LNCS (LNAI), vol. 10713, pp. 180–191. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-73706-5_15

  10. Bretschneider, U., Wöhner, T., Peters, R.: Detecting online harassment in social networks. In: Proceedings International Conference on Information Systems - Building a Better World Through Information Systems, ICIS 2014, pp. 1–14. Association for Information Systems, Auckland, New Zealand (2014)

  11. vom Brocke, J., Simons, A., Niehaves, B., Riemer, K., Plattfaut, R., Cleven, A.: Reconstructing the giant: on the importance of rigour in documenting the literature search process. In: Proceedings 17th European Conference on Information Systems, ECIS 2009, Verona, Italy, pp. 2206–2217 (2009)

  12. Brunk, J., Mattern, J., Riehle, D.M.: Effect of transparency and trust on acceptance of automatic online comment moderation systems. In: Proceedings 21st IEEE Conference on Business Informatics, CBI 2019. IEEE, Moscow, Russia (2019)

  13. Brunk, J., Niemann, M., Riehle, D.M.: Can analytics as a service save the media industry? - The case of online comment moderation. In: Proceedings 21st IEEE Conference on Business Informatics, CBI 2019. IEEE, Moscow, Russia (2019)

  14. Burnap, P., Williams, M.L.: Us and them: identifying cyber hate on Twitter across multiple protected characteristics. EPJ Data Sci. 5(1), 11 (2016)

  15. Cambridge University Press: abusive (2017). http://dictionary.cambridge.org/dictionary/english/abusive

  16. Chen, Y., Zhou, Y., Zhu, S., Xu, H.: Detecting offensive language in social media to protect adolescent online safety. In: Proceedings 2012 ASE/IEEE International Conference on Social Computing, 2012 ASE/IEEE International Conference on Privacy, Security, Risk and Trust, Amsterdam, Netherlands, pp. 71–80 (2012)

  17. Collins: abusive definition and meaning (2017). https://www.collinsdictionary.com/dictionary/english/abusive

  18. Cooper, H.M.: Organizing knowledge syntheses: a taxonomy of literature reviews. Knowl. Soc. 1(1), 104–126 (1988)

  19. Council of Europe: Recommendation No. R (97) 20 of the Committee of Ministers to Member States on “Hate Speech” (1997)

  20. Council of Europe: Recommendation No. R (97) 21 of the Committee of Ministers to Member States on the Media and the Promotion of a Culture of Tolerance (1997)

  21. Council of Europe: European Convention on Human Rights (2010)

  22. Davidson, T., Warmsley, D., Macy, M., Weber, I.: Automated hate speech detection and the problem of offensive language. In: Eleventh International AAAI Conference on Web and Social Media, Montreal, Canada (2017)

  23. Del Vigna, F., Cimino, A., Dell’Orletta, F., Petrocchi, M., Tesconi, M.: Hate me, hate me not: hate speech detection on Facebook. In: 1st Italian Conference on Cybersecurity, Venice, Italy (2017)

  24. European Commission: Applying EU law (2017). https://ec.europa.eu/info/law/law-making-process/overview-law-making-process/applying-eu-law_en

  25. European Commission against Racism and Intolerance: ECRI General Policy Recommendation No. 1 on Combating Racism, Xenophobia, Antisemitism and Intolerance (1996)

  26. European Commission against Racism and Intolerance: ECRI General Policy Recommendation No. 2 on Specialised Bodies to Combat Racism, Xenophobia, Antisemitism and Intolerance at National Level (1997)

  27. European Commission against Racism and Intolerance: ECRI General Policy Recommendation No. 6 on Combating the Dissemination of Racist, Xenophobic and Antisemitic Material via the Internet (2000)

  28. European Commission against Racism and Intolerance: ECRI General Policy Recommendation No. 7 on National Legislation to Combat Racism and Racial Discrimination (2002)

  29. European Commission against Racism and Intolerance: ECRI General Policy Recommendation No. 15 on Combating Hate Speech (2015)

  30. European Union: Council directive 2000/43/EC of 29 June 2000 implementing the principle of equal treatment between persons irrespective of racial or ethnic origin. Off. J. Eur. Communities L 180, 22–26 (2000)

  31. European Union: The charter of fundamental rights of the European Union. Off. J. Eur. Communities C 364, 1–22 (2000)

  32. European Union: Treaty of Lisbon - amending the Treaty on European Union and the Treaty establishing the European Community. Off. J. Eur. Union C 306, 1–271 (2007)

  33. European Union: Council framework decision 2008/913/JHA of 28 November 2008 on combating certain forms and expressions of racism and xenophobia by means of criminal law. Off. J. Eur. Union L 328, 55–58 (2008)

  34. European Union: Consolidated version of the treaty on the functioning of the European Union. Off. J. Eur. Union C 326, 47–390 (2012)

  35. Faiola, A.: Germany springs to action over hate speech against migrants (2016). https://www.washingtonpost.com/world/europe/germany-springs-to-action-over-hate-speech-against-migrants/2016/01/06/6031218e-b315-11e5-8abc-d09392edc612_story.html?utm_term=.737b4d4453d3

  36. Fišer, D., Erjavec, T., Ljubešić, N.: Legal framework, dataset and annotation schema for socially unacceptable online discourse practices in Slovene. In: Proceedings First Workshop on Abusive Language Online, Vancouver, Canada, pp. 46–51 (2017)

  37. Fortuna, P., Nunes, S.: A survey on automatic detection of hate speech in text. ACM Comput. Surv. 51(4), 1–30 (2018)

  38. Gardiner, B., Mansfield, M., Anderson, I., Holder, J., Louter, D., Ulmanu, M.: The dark side of Guardian comments (2016). https://www.theguardian.com/technology/2016/apr/12/the-dark-side-of-guardian-comments

  39. Gilbert, E., Lampe, C., Leavitt, A., Lo, K., Yarosh, L.: Conceptualizing, creating, & controlling constructive and controversial comments. In: Companion 2017 ACM Conference Computer Supported Cooperative Work, Social Computing, Portland, OR, USA, pp. 425–430 (2017)

  40. Gillespie, T.: The scale is just unfathomable (2018). https://logicmag.io/04-the-scale-is-just-unfathomable/

  41. Grudin, J.: Computer-supported cooperative work: history and focus. Computer 27(5), 19–26 (1994)

  42. Guberman, J., Hemphill, L.: Challenges in modifying existing scales for detecting harassment in individual Tweets. In: Proceedings 50th Hawaii International Conference System Sciences, HICSS 2017, pp. 2203–2212. Association for Information Systems, Waikoloa Village, Hawaii, USA (2017)

  43. Hammer, H.L.: Automatic detection of hateful comments in online discussion. In: Maglaras, L.A., Janicke, H., Jones, K. (eds.) INISCOM 2016. LNICST, vol. 188, pp. 164–173. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-52569-3_15

  44. Harnad, S.: The symbol grounding problem. Physica D 42(1–3), 335–346 (1990)

  45. Harnad, S.: Symbol-grounding problem. In: Encyclopedia of Cognitive Science, vol. 42, pp. 335–346. Wiley, Chichester (2006)

  46. Jay, T., Janschewitz, K.: The pragmatics of swearing. J. Politeness Res. Lang. Behav. Cult. 4(2), 267–288 (2008)

  47. Köffer, S., Riehle, D.M., Höhenberger, S., Becker, J.: Discussing the value of automatic hate speech detection in online debates. In: Drews, P., Funk, B., Niemeyer, P., Xie, L. (eds.) MKWI 2018, Lüneburg, Germany (2018)

  48. Macmillan Publishers Limited: abusive (adjective) definition and synonyms (2017). http://www.macmillandictionary.com/dictionary/british/abusive

  49. Merriam-Webster: Abusive (2017). https://www.merriam-webster.com/dictionary/abusive

  50. Niemann, M.: Abusiveness is non-binary: five shades of gray in German online news-comments. In: Proceedings 21st IEEE Conference Business Informatics, CBI 2019. IEEE, Moscow, Russia (2019)

  51. Nobata, C., Tetreault, J., Thomas, A., Mehdad, Y., Chang, Y.: Abusive language detection in online user content. In: Proceedings 25th International Conference World Wide Web, pp. 145–153, Montreal, Canada (2016)

  52. Oxford University Press: Abusive (2017). https://en.oxforddictionaries.com/definition/abusive

  53. Parliamentary Assembly: Recommendation 1805 (2007): Blasphemy, religious insults and hate speech against persons on grounds of their religion (2007)

  54. Pater, J.A., Kim, M.K., Mynatt, E.D., Fiesler, C.: Characterizations of online harassment: comparing policies across social media platforms. In: Proceedings 19th International Conference Supporting Group Work, GROUP 2016, pp. 369–374. ACM Press, Sanibel Island, Florida, USA (2016)

  55. Pearson: Abusive (2017). http://www.ldoceonline.com/dictionary/abusive

  56. Poletto, F., Stranisci, M., Sanguinetti, M., Patti, V., Bosco, C.: Hate speech annotation: analysis of an Italian Twitter corpus. In: 4th Italian Conference on Computational Linguistics, CLiC-it 2017, vol. 2006, pp. 1–6. CEUR-WS (2017)

  57. Ravluševičius, P.: The enforcement of the primacy of the European Union law-legal doctrine and practice. Jurisprudence 18(4), 1369–1388 (2011)

  58. Razavi, A.H., Inkpen, D., Uritsky, S., Matwin, S.: Offensive language detection using multi-level classification. In: Farzindar, A., Kešelj, V. (eds.) AI 2010. LNCS (LNAI), vol. 6085, pp. 16–27. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-13059-5_5

  59. Ross, B., Rist, M., Carbonell, G., Cabrera, B., Kurowsky, N., Wojatzki, M.: Measuring the reliability of hate speech annotations: the case of the European refugee crisis. In: Proceedings 3rd Workshop on Natural Language Processing for Computer-Mediated Communication, Bochum, Germany, pp. 6–9 (2016)

  60. Seo, S., Cho, S.B.: Offensive sentence classification using character-level CNN and transfer learning with fake sentences. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, E.S. (eds.) International Conference on Neural Information Processing, pp. 532–539. Springer, Cham (2017)

  61. Solon, O.: Underpaid and overburdened: the life of a Facebook moderator (2017). https://www.theguardian.com/news/2017/may/25/facebook-moderator-underpaid-overburdened-extreme-content

  62. Sood, S.O., Antin, J., Churchill, E.F.: Using crowdsourcing to improve profanity detection. In: AAAI Spring Symposium Series, Palo Alto, CA, USA, pp. 69–74 (2012)

  63. Sood, S.O., Churchill, E.F., Antin, J.: Automatic identification of personal insults on social news sites. J. Am. Soc. Inf. Sci. Technol. 63(2), 270–285 (2012)

  64. Švec, A., Pikuliak, M., Šimko, M., Bieliková, M.: Improving moderation of online discussions via interpretable neural models. In: Proceedings Second Workshop on Abusive Language Online, ALW2, Brussels, Belgium (2018)

  65. Tuarob, S., Mitrpanont, J.L.: Automatic discovery of abusive Thai language usages in social networks. In: Choemprayong, S., Crestani, F., Cunningham, S.J. (eds.) ICADL 2017. LNCS, vol. 10647, pp. 267–278. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-70232-2_23

  66. Warner, W., Hirschberg, J.: Detecting hate speech on the world wide web. In: Proceedings Second Workshop on Language in Social Media, Montreal, Canada, pp. 19–26 (2012)

  67. Waseem, Z.: Are you a racist or am I seeing things? Annotator influence on hate speech detection on Twitter. In: Proceedings First Workshop on NLP and Computational Social Science, Austin, Texas, USA, pp. 138–142 (2016)

  68. Waseem, Z., Davidson, T., Warmsley, D., Weber, I.: Understanding abuse: a typology of abusive language detection subtasks. In: Proceedings First Workshop Abusive Language Online, Vancouver, Canada, pp. 78–84 (2017)

  69. Waseem, Z., Hovy, D.: Hateful symbols or hateful people? Predictive features for hate speech detection on Twitter. In: Proceedings NAACL Student Research Workshop, Stroudsburg, PA, USA, pp. 88–93 (2016)

  70. Watanabe, H., Bouazizi, M., Ohtsuki, T.: Hate speech on Twitter: a pragmatic approach to collect hateful and offensive expressions and perform hate speech detection. IEEE Access 6, 13825–13835 (2018)

  71. Webster, J., Watson, R.T.: Analyzing the past to prepare for the future: writing a literature review. MIS Q. 26(2), xiii–xxiii (2002)

  72. Yenala, H., Jhanwar, A., Chinnakotla, M.K., Goyal, J.: Deep learning for detecting inappropriate content in text. Int. J. Data Sci. Anal. 6(4), 273–286 (2018)

  73. Yin, D., Xue, Z., Hong, L., Davison, B.D., Kontostathis, A., Edwards, L.: Detection of harassment on Web 2.0. In: Proceedings Content Analysis WEB, CAW2.0, Madrid, Spain, pp. 1–7 (2009)

Acknowledgments

The research leading to these results received funding from the federal state of North Rhine-Westphalia and the European Regional Development Fund (EFRE.NRW 2014–2020), Project: (No. CM-2-2-036a).

Author information

Corresponding author

Correspondence to Marco Niemann.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Niemann, M., Riehle, D.M., Brunk, J., Becker, J. (2020). What Is Abusive Language?. In: Grimme, C., Preuss, M., Takes, F., Waldherr, A. (eds) Disinformation in Open Online Media. MISDOOM 2019. Lecture Notes in Computer Science, vol 12021. Springer, Cham. https://doi.org/10.1007/978-3-030-39627-5_6

  • DOI: https://doi.org/10.1007/978-3-030-39627-5_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-39626-8

  • Online ISBN: 978-3-030-39627-5

  • eBook Packages: Computer Science, Computer Science (R0)
