Abstract
Detecting abusive posts is more complicated than it seems because the data are unseemly, unstructured, and noisy, and the context is unpredictable. The learning performance of neural networks attracts researchers seeking the highest-performing models. Still, noisy data limits how well a neural network can be trained. In this work, we propose an approach that combines the strengths of both machine learning and neural networks to obtain the best result. Our approach achieves an F1 score of 92.79.
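For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F1 score can be computed from raw prediction counts. The function name `f1_score` and the counts used below are hypothetical and are not taken from the paper; the abstract does not state which averaging scheme was used, so this shows only the plain binary case, scaled to a percentage to match the way the result is reported.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the binary F1 score as a percentage.

    tp, fp, fn are true positives, false positives, and false negatives
    for the positive (here: abusive) class.
    """
    # Precision: share of posts flagged as abusive that really are abusive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of truly abusive posts that were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    # F1 is the harmonic mean of precision and recall.
    return 100 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not the paper's data).
    print(round(f1_score(tp=90, fp=10, fn=10), 2))
```

Because F1 is a harmonic mean, it stays low unless precision and recall are both high, which makes it a common choice for abusive-content datasets where the classes are imbalanced.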
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Tiwari, P., Rai, S. (2021). Mind Your Tweet: Abusive Tweet Detection. In: Karpov, A., Potapova, R. (eds) Speech and Computer. SPECOM 2021. Lecture Notes in Computer Science, vol 12997. Springer, Cham. https://doi.org/10.1007/978-3-030-87802-3_63
DOI: https://doi.org/10.1007/978-3-030-87802-3_63
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87801-6
Online ISBN: 978-3-030-87802-3
eBook Packages: Computer Science, Computer Science (R0)