Multi-feature Transformer for Multiclass Cyberbullying Detection in Bangla

Wahid, Zaman; Al Imran, Abdullah

doi:10.1007/978-3-031-34111-3_37

Zaman Wahid & Abdullah Al Imran

Part of the book series: IFIP Advances in Information and Communication Technology (IFIPAICT, volume 675)

Included in the following conference series:

IFIP International Conference on Artificial Intelligence Applications and Innovations


Abstract

Cyberbullying is a global problem, and detecting it is essential to making cyberspace safer for millions of online users, services, and organizations. Online harassment of the general public and celebrities is now commonplace on social media, particularly in Bangladesh. In this paper, we present a novel multi-feature transformer followed by a deep neural network for multidimensional cyberbullying detection. Using online Bangla textual data, we combine the user’s social profile, lexical features, contextual embeddings, and semantic similarities among word associations in Bangla to develop an effective and robust cyberbullying detection system. Our proposed method detects cyberbullying in Bangla with 98% accuracy for threats and 90% accuracy for sarcastic comments. The aggregate accuracy across all six class labels is 86.3%. In addition, the experimental results show that the proposed technique outperforms state-of-the-art methods for detecting cyberbullying in Bangla.
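The abstract names four feature sources but does not say how they are combined. As a rough illustration only, the sketch below assumes the simplest possible fusion: each group is normalized and the groups are concatenated into one vector that a downstream classifier could consume. The group names, the normalization step, and the toy values are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence

# Hypothetical group names, chosen to mirror the four sources listed in the abstract.
FEATURE_GROUPS = (
    "social_profile",
    "lexical",
    "contextual_embedding",
    "semantic_similarity",
)


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so no group dominates by magnitude alone."""
    norm = sum(x * x for x in vec) ** 0.5
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse_features(groups: Dict[str, Sequence[float]]) -> List[float]:
    """Concatenate the normalized groups in a fixed order (assumed fusion strategy)."""
    fused: List[float] = []
    for name in FEATURE_GROUPS:
        fused.extend(l2_normalize(groups[name]))
    return fused


# Toy values for a single comment; real features would come from trained models.
example = {
    "social_profile": [3.0, 4.0],
    "lexical": [1.0, 0.0, 0.0],
    "contextual_embedding": [0.0, 2.0],
    "semantic_similarity": [0.0],
}
fused = fuse_features(example)
print(len(fused))
```

Concatenation is only one option; the paper's actual architecture may weight or attend over the groups differently.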

Z. Wahid and A. Al Imran—Both authors contributed to this work equally.

Author information

Authors and Affiliations

University of Calgary, Calgary, AB, Canada
Zaman Wahid
University of Liverpool, Liverpool, UK
Abdullah Al Imran

Corresponding author

Correspondence to Zaman Wahid.

Editor information

Editors and Affiliations

University of Piraeus, Piraeus, Greece
Ilias Maglogiannis
Democritus University of Thrace, Xanthi, Greece
Lazaros Iliadis
University of Sunderland, Sunderland, UK
John MacIntyre
University of Leon, León, Spain
Manuel Dominguez

About this paper

Cite this paper

Wahid, Z., Al Imran, A. (2023). Multi-feature Transformer for Multiclass Cyberbullying Detection in Bangla. In: Maglogiannis, I., Iliadis, L., MacIntyre, J., Dominguez, M. (eds) Artificial Intelligence Applications and Innovations. AIAI 2023. IFIP Advances in Information and Communication Technology, vol 675. Springer, Cham. https://doi.org/10.1007/978-3-031-34111-3_37

DOI: https://doi.org/10.1007/978-3-031-34111-3_37
Published: 01 June 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-34110-6
Online ISBN: 978-3-031-34111-3
eBook Packages: Computer Science, Computer Science (R0)
