
Abstract

The past decade has been characterized by a strong increase in the use of social media and continuous growth of public online discussion. As purely manual moderation fails to keep pace, platform operators have started searching for semi-automated solutions, for which the application of Natural Language Processing (NLP) and Machine Learning (ML) techniques is promising. However, this requires substantial financial investment in algorithmic implementations, data collection, and model training, which only big players can afford. To support small and medium-sized media enterprises (SMEs), we developed an integrated comment moderation system as an IT platform. This platform acts as a service provider and offers Analytics as a Service (AaaS) to SMEs. Operating such a platform, however, requires a robust technology stack, integrated workflows, and well-defined interfaces between all parties. In this paper, we develop and discuss a suitable IT architecture and present a prototypical implementation.
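
To make the AaaS idea concrete, the following minimal sketch shows how a comment-scoring endpoint of such a platform could be exposed as a web service. It assumes a Flask-based service (Flask is referenced in note 1 below); the route, request fields, and placeholder scoring function are illustrative assumptions, not the authors' actual prototype.

    # A minimal sketch, assuming a Flask-based comment-scoring service; the
    # route, JSON fields, and toy scoring logic are illustrative only and do
    # not reproduce the authors' prototype.

    from flask import Flask, jsonify, request

    app = Flask(__name__)

    # Toy stand-in for the platform's trained abusive-language classifier.
    BLOCKLIST = {"idiot", "moron"}


    def score_comment(text: str) -> float:
        """Return 1.0 if the comment contains a blocklisted token, else 0.0."""
        tokens = text.lower().split()
        return 1.0 if any(token in BLOCKLIST for token in tokens) else 0.0


    @app.route("/api/v1/moderation/score", methods=["POST"])
    def score():
        payload = request.get_json(silent=True) or {}
        comment = payload.get("comment", "")
        return jsonify({
            "comment": comment,
            "abuse_score": score_comment(comment),  # 0.0 = clean, 1.0 = flagged
        })


    if __name__ == "__main__":
        # Development server only; a production deployment would run behind a
        # WSGI server and could be containerized and orchestrated, e.g. with
        # Kubernetes (see note 2 below).
        app.run(port=5000)

A connected content management system could then hold comments with high scores for human review, in line with the semi-automatic moderation workflow the title describes.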


Notes

  1. https://www.palletsprojects.com/p/flask/.

  2. https://kubernetes.io/.

  3. The decision for Intel CPUs acknowledges Intel’s leading market position for server processors.
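
Notes 1 and 2 name Flask and Kubernetes as parts of the prototype's technology stack. As a complementary, purely hypothetical illustration of the well-defined interface between the platform and an SME, a client-side integration could submit comments to a hosted scoring service like the one sketched above (the base URL, endpoint path, JSON fields, and 0.5 threshold are assumptions, not part of the paper):

    # Hypothetical SME-side call to the hosted moderation service sketched
    # earlier; URL, fields, and threshold are assumptions for illustration.

    import requests


    def needs_review(comment: str, base_url: str = "http://localhost:5000") -> bool:
        """Ask the moderation service whether a comment should be held for review."""
        response = requests.post(
            f"{base_url}/api/v1/moderation/score",
            json={"comment": comment},
            timeout=5,
        )
        response.raise_for_status()
        return response.json()["abuse_score"] >= 0.5


    if __name__ == "__main__":
        print(needs_review("Thanks for this thoughtful article!"))  # expected: False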


Acknowledgements

The research leading to these results received funding from the federal state of North Rhine-Westphalia and the European Regional Development Fund (EFRE.NRW 2014–2020), project no. CM-2-2-036a.

Author information


Corresponding author

Correspondence to Dennis M. Riehle.



Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Riehle, D.M., Niemann, M., Brunk, J., Assenmacher, D., Trautmann, H., Becker, J. (2020). Building an Integrated Comment Moderation System – Towards a Semi-automatic Moderation Tool. In: Meiselwitz, G. (eds) Social Computing and Social Media. Participation, User Experience, Consumer Experience, and Applications of Social Computing. HCII 2020. Lecture Notes in Computer Science, vol. 12195. Springer, Cham. https://doi.org/10.1007/978-3-030-49576-3_6


  • DOI: https://doi.org/10.1007/978-3-030-49576-3_6

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-49575-6

  • Online ISBN: 978-3-030-49576-3

  • eBook Packages: Computer Science, Computer Science (R0)
