DOI: 10.1145/3583780.3615260
short-paper

Hateful Comment Detection and Hate Target Type Prediction for Video Comments

Published: 21 October 2023

ABSTRACT

With the widespread increase in hateful content on the web, hate detection has become more crucial than ever. Although a vast literature exists on hate detection from text, images and videos, there has been no previous work on hateful comment detection (HCD) from video pages. HCD is critical for comment moderation and for flagging controversial videos. Comments are often short, contextual and convoluted, which makes the problem challenging. Toward solving this problem, we contribute a dataset, HateComments, consisting of 2071 comments for 401 videos obtained from two popular video sharing platforms. We investigate two related tasks: binary HCD and 4-class multi-label hate target-type prediction (HTP). We systematically explore the importance of various forms of context for effective HCD. Our initial experiments show that our best method, which leverages rich video context (like description, transcript and visual input), leads to an HCD accuracy of ~78.6% and an ROC AUC score of ~0.61 for HTP. Code and data are available at https://drive.google.com/file/d/1EUbWDUokv1CYkWKlwByUC6yIuBGUw2MN/.
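The two reported metrics can be illustrated with a minimal sketch: accuracy for the binary HCD task and ROC AUC for the 4-class multi-label HTP task. This is not the authors' evaluation code. The function names, the toy labels and scores, and the choice of macro-averaging the per-class ROC AUC are all assumptions made for illustration; the abstract does not say how the HTP score is averaged.

```python
from typing import List, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of comments whose binary hate label is predicted correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive example is scored
    above a random negative one (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def macro_roc_auc(y_true: List[List[int]], scores: List[List[float]]) -> float:
    """Unweighted mean of per-class ROC AUC over the label columns
    (macro-averaging is an assumption, not stated in the abstract)."""
    n_classes = len(y_true[0])
    per_class = [
        roc_auc([row[c] for row in y_true], [row[c] for row in scores])
        for c in range(n_classes)
    ]
    return sum(per_class) / n_classes


# Hypothetical toy data: 5 comments for HCD, 4 comments x 4 target types for HTP.
hcd_true = [1, 0, 1, 0, 1]
hcd_pred = [1, 0, 0, 0, 1]

htp_true = [[1, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 0], [1, 1, 0, 1]]
htp_scores = [
    [0.9, 0.2, 0.1, 0.4],
    [0.3, 0.8, 0.2, 0.6],
    [0.1, 0.3, 0.7, 0.2],
    [0.6, 0.4, 0.3, 0.7],
]

print(accuracy(hcd_true, hcd_pred))         # 0.8
print(macro_roc_auc(htp_true, htp_scores))  # 0.9375
```

Because HTP is multi-label, each target type is scored as its own binary problem, so a comment can count as a positive for several classes at once.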


Supplemental Material

shp3000-video.mp4 (MP4, 103.6 MB)


        • Published in

          CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
          October 2023
          5508 pages
          ISBN:9798400701245
          DOI:10.1145/3583780

          Copyright © 2023 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
