Abstract
Humanity has profited enormously from the interchange of information and the expanding use of social media, but this growth has also raised a number of challenges, such as the persistence of hate speech. To address this growing problem on social media platforms, recent studies have used different feature engineering techniques and machine learning algorithms to automatically detect hate comments across numerous datasets. Several studies so far have compared feature engineering strategies paired with machine learning algorithms to discover which strategy is the most productive. This investigation aims to examine the performance of multiple feature engineering approaches with five machine learning algorithms. The dataset contains the class labels hate speech, not hate speech, and offensive comments, and the social media posts are split into these groups. To recognize the particular traits of hate speech text messages, the appropriate n-gram feature sets are extracted. The n-gram TF-IDF weights provide the foundation for these feature models. The main aim of this research is to analyze and address the above problem, and to compare the algorithms and features used in machine learning to automatically detect hate speech and label posts into classes such as hate speech, offensive, and neither. After comparing different classifiers, Random Forest achieved better accuracy, precision, and recall than SVM (Support Vector Machine), Naive Bayes, Logistic Regression, AdaBoost, and Gradient Boosting. The system achieved an accuracy of 90.26% using Random Forest. The experimental results showed that Random Forest provided the best overall accuracy among the models built, and that it is more accurate than other recent work on this task. From the model's results, the intensity of the comments can therefore be extracted.
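The n-gram TF-IDF weighting mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the unigram-plus-bigram range, the smoothed IDF formula ln((1 + N) / (1 + df)) + 1, the L2 normalization, and the names `word_ngrams`, `tfidf_vectors`, and `toy_posts` are all assumptions made for illustration.

```python
import math
from collections import Counter


def word_ngrams(text, n_min=1, n_max=2):
    """Return word n-grams of a text (assumed: lowercase, whitespace tokens)."""
    tokens = text.lower().split()
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf_vectors(docs, n_min=1, n_max=2):
    """Map each document to a dict of n-gram -> L2-normalized TF-IDF weight."""
    counts = [Counter(word_ngrams(d, n_min, n_max)) for d in docs]
    n_docs = len(docs)

    # Document frequency: number of documents containing each n-gram.
    df = Counter()
    for c in counts:
        df.update(c.keys())

    # Smoothed IDF (an assumed variant; the paper does not state its formula).
    idf = {g: math.log((1 + n_docs) / (1 + df[g])) + 1.0 for g in df}

    vectors = []
    for c in counts:
        weights = {g: tf * idf[g] for g, tf in c.items()}
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({g: w / norm for g, w in weights.items()})
    return vectors


# Hypothetical toy posts, not taken from the paper's dataset.
toy_posts = ["you are awful people", "you are kind people", "what a nice day"]
vectors = tfidf_vectors(toy_posts)
```

In this sketch, n-grams that occur in fewer posts (such as "awful") receive a larger weight than those shared across posts (such as "you"). Sparse vectors of this kind are the sort of feature representation that would then be passed to a classifier such as Random Forest.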
© 2023 IFIP International Federation for Information Processing
About this paper
Cite this paper
Haider, F., Dipty, I., Rahman, F., Assaduzzaman, M., Sohel, A. (2023). Social Media Hate Speech Detection Using Machine Learning Approach. In: Chandran K R, S., N, S., A, B., Hamead H, S. (eds) Computational Intelligence in Data Science. ICCIDS 2023. IFIP Advances in Information and Communication Technology, vol 673. Springer, Cham. https://doi.org/10.1007/978-3-031-38296-3_17
DOI: https://doi.org/10.1007/978-3-031-38296-3_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-38295-6
Online ISBN: 978-3-031-38296-3
eBook Packages: Computer Science, Computer Science (R0)