Multi-feature Transformer for Multiclass Cyberbullying Detection in Bangla

  • Conference paper
Artificial Intelligence Applications and Innovations (AIAI 2023)

Abstract

Cyberbullying is a global problem, and detecting it is essential to making cyberspace safer for millions of online users, services, and organizations. Online harassment of both the general public and celebrities is now commonplace on social media, particularly in Bangladesh. In this paper, we present a novel multi-feature transformer followed by a deep neural network for multi-dimensional cyberbullying detection. Using online Bangla textual data, we combine the user’s social profile, lexical features, contextual embeddings, and the semantic similarities among word associations in Bangla to develop an effective and robust cyberbullying detection system. The proposed method detects cyberbullying in Bangla with 98% accuracy for threats and 90% accuracy for sarcastic comments. The aggregate accuracy across all six class labels is 86.3%. In addition, the experimental results show that the proposed technique outperforms state-of-the-art methods for detecting cyberbullying in Bangla.
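The abstract names four feature groups that feed a deep neural network producing one of six labels. As a rough illustration only, the sketch below concatenates four feature vectors and passes them through a small feed-forward classifier. It is not the authors' architecture: concatenation as the fusion step, the feature sizes, the hidden width, the random (untrained) weights, and every function name are assumptions made for this example.

```python
import math
import random

NUM_CLASSES = 6  # the abstract reports six multiclass labels


def fuse(profile, lexical, contextual, semantic):
    """Concatenate the four feature groups into one flat vector (assumed fusion step)."""
    return list(profile) + list(lexical) + list(contextual) + list(semantic)


def dense(x, weights, bias):
    """Fully connected layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(x):
    """Numerically stable softmax over a list of scores."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_in, n_out):
    """Small random weights; a real model would learn these from labelled data."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def classify(fused, hidden, output):
    """One hidden layer with ReLU, then a softmax over the class labels."""
    h = relu(dense(fused, *hidden))
    return softmax(dense(h, *output))


rng = random.Random(0)

# Hypothetical feature vectors; the sizes are chosen only for illustration.
profile = [0.2, 0.7]                              # stand-in for user social-profile signals
lexical = [1.0, 0.0, 3.0]                         # stand-in for lexical features
contextual = [rng.gauss(0, 1) for _ in range(8)]  # stand-in for a contextual embedding
semantic = [rng.gauss(0, 1) for _ in range(4)]    # stand-in for word-association similarity

fused = fuse(profile, lexical, contextual, semantic)
hidden = init_layer(rng, len(fused), 16)
output = init_layer(rng, 16, NUM_CLASSES)

probs = classify(fused, hidden, output)
predicted = max(range(NUM_CLASSES), key=probs.__getitem__)
```

Because the weights are untrained, the predicted label carries no meaning here; the point is only how heterogeneous features can share one classifier input.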

Z. Wahid and A. Al Imran contributed equally to this work.

Author information

Corresponding author

Correspondence to Zaman Wahid.

Copyright information

© 2023 IFIP International Federation for Information Processing

About this paper

Cite this paper

Wahid, Z., Al Imran, A. (2023). Multi-feature Transformer for Multiclass Cyberbullying Detection in Bangla. In: Maglogiannis, I., Iliadis, L., MacIntyre, J., Dominguez, M. (eds) Artificial Intelligence Applications and Innovations. AIAI 2023. IFIP Advances in Information and Communication Technology, vol 675. Springer, Cham. https://doi.org/10.1007/978-3-031-34111-3_37

  • DOI: https://doi.org/10.1007/978-3-031-34111-3_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-34110-6

  • Online ISBN: 978-3-031-34111-3

  • eBook Packages: Computer Science, Computer Science (R0)
