Abstract:
Detecting hate speech in low-resource languages such as Myanmar is a critical yet challenging task because linguistic resources are limited. In this study, we propose a methodology that leverages a hate speech dictionary derived from annotated data to improve classification accuracy. Our contributions include manually collecting data from Facebook and annotating it to build a comprehensive corpus. Using fastText models, we investigate the impact of filtering hate speech content from sentences, comparing classifiers trained on unfiltered long sentences against those trained on hate speech-filtered short sentences obtained by lexicon-based filtering. Our experiments show notable accuracy improvements from incorporating the hate speech lexicon, with the best accuracy reaching 0.771. This research underscores the effectiveness of lexicon-based filtering for improving hate speech detection in low-resource language settings.
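As a rough illustration of lexicon-based filtering, the sketch below shortens a tokenized sentence to the tokens near a lexicon hit and writes it in the `__label__` format that fastText's supervised mode reads. The abstract does not give the filtering rule, so the windowing strategy, the function names `filter_by_lexicon` and `to_fasttext_line`, the placeholder English lexicon, and the fallback for sentences without hits are all assumptions made for illustration, not the authors' method. Real Myanmar text would also need word segmentation first, which is omitted here.

```python
from typing import Iterable, List, Set


def filter_by_lexicon(tokens: List[str], lexicon: Set[str], window: int = 1) -> List[str]:
    """Keep only tokens within `window` positions of a lexicon hit.

    Assumption: sentences with no lexicon hit are returned unchanged,
    so they are not reduced to empty strings.
    """
    hits = [i for i, tok in enumerate(tokens) if tok in lexicon]
    if not hits:
        return list(tokens)
    keep = set()
    for i in hits:
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        keep.update(range(start, end))
    return [tokens[i] for i in sorted(keep)]


def to_fasttext_line(label: str, tokens: Iterable[str]) -> str:
    """Format one training example in fastText's supervised text format."""
    return f"__label__{label} " + " ".join(tokens)


# Hypothetical placeholder lexicon and sentence (not from the paper's corpus).
lexicon = {"badword", "slur"}
sentence = "they said that badword again during the long meeting".split()

short = filter_by_lexicon(sentence, lexicon, window=1)
print(to_fasttext_line("hate", short))
```

Lines produced this way could be written to a text file and used to train a fastText supervised classifier. Comparing that model against one trained on the unfiltered sentences would mirror the long-versus-short comparison the abstract describes.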
Published in: 2024 21st International Joint Conference on Computer Science and Software Engineering (JCSSE)
Date of Conference: 19-22 June 2024
Date Added to IEEE Xplore: 02 August 2024