Abstract
Cyberbullying is a global problem, and detecting it is necessary to make cyberspace safer for millions of online users, services, and organizations. Online harassment of the general public and celebrities is now commonplace on social media, particularly in Bangladesh. In this paper, we present a novel multi-feature transformer followed by a deep neural network for multi-dimensional cyberbullying detection. Using online Bangla textual data, we incorporate the user's social profile, lexical features, contextual embeddings, and semantic similarities among word associations in Bangla to develop an effective and robust cyberbullying detection system. Our proposed method detects cyberbullying in Bangla with 98% accuracy for threats and 90% accuracy for sarcastic comments. The aggregate accuracy across all six class labels is 86.3%. In addition, the experimental results show that the proposed technique outperforms state-of-the-art methods for detecting cyberbullying in Bangla.
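The fusion idea summarized above can be illustrated with a minimal sketch. It is not the authors' architecture: a plain concatenation of feature groups feeding a small feed-forward network stands in for the multi-feature transformer, the weights are random and untrained, and every name and dimension (`FEATURE_DIMS`, `fuse_features`, `classify`, the hidden size of 16) is a hypothetical choice made for illustration. Only the four feature groups and the six output classes come from the abstract.

```python
import math
import random

# Hypothetical sizes for each feature group; the abstract gives no dimensions.
FEATURE_DIMS = {"profile": 4, "lexical": 6, "contextual": 8, "semantic": 3}
NUM_CLASSES = 6  # the abstract reports six multiclass labels


def fuse_features(groups):
    """Concatenate the per-group feature vectors in a fixed order."""
    fused = []
    for name, dim in FEATURE_DIMS.items():
        vec = groups[name]
        if len(vec) != dim:
            raise ValueError(f"{name}: expected {dim} values, got {len(vec)}")
        fused.extend(vec)
    return fused


def dense(x, weights, bias):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(x):
    """Numerically stable softmax over a list of scores."""
    peak = max(x)
    exps = [math.exp(v - peak) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_in, n_out):
    """Random, untrained parameters; a real model would learn these."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def classify(groups, params):
    """Fuse the feature groups, then apply a two-layer feed-forward net."""
    (w1, b1), (w2, b2) = params
    hidden = relu(dense(fuse_features(groups), w1, b1))
    return softmax(dense(hidden, w2, b2))


rng = random.Random(0)
input_dim = sum(FEATURE_DIMS.values())
params = [init_layer(rng, input_dim, 16), init_layer(rng, 16, NUM_CLASSES)]

# Placeholder feature values standing in for one Bangla comment.
example = {name: [rng.random() for _ in range(dim)]
           for name, dim in FEATURE_DIMS.items()}
probs = classify(example, params)
```

Here `probs` holds one probability per class label. With trained weights, the largest entry would give the predicted category for the comment.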
Z. Wahid and A. Al Imran contributed equally to this work.
Copyright information
© 2023 IFIP International Federation for Information Processing
About this paper
Cite this paper
Wahid, Z., Al Imran, A. (2023). Multi-feature Transformer for Multiclass Cyberbullying Detection in Bangla. In: Maglogiannis, I., Iliadis, L., MacIntyre, J., Dominguez, M. (eds) Artificial Intelligence Applications and Innovations. AIAI 2023. IFIP Advances in Information and Communication Technology, vol 675. Springer, Cham. https://doi.org/10.1007/978-3-031-34111-3_37
DOI: https://doi.org/10.1007/978-3-031-34111-3_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-34110-6
Online ISBN: 978-3-031-34111-3
eBook Packages: Computer Science, Computer Science (R0)