Mind Your Tweet: Abusive Tweet Detection

Tiwari, Paras; Rai, Sawan

doi:10.1007/978-3-030-87802-3_63

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12997)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

Detecting abusive posts is harder than it appears because the data is offensive, unstructured, and noisy, and its context is unpredictable. The strong learning performance of neural networks has drawn researchers to them in pursuit of the best-performing models. Still, neural networks have limitations when trained on noisy data. In this work, we propose an approach that combines the strengths of both classical machine learning and neural networks to obtain the best result. Our approach achieves an F1 score of 92.79.
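For readers unfamiliar with the metric quoted above, the F1 score is the harmonic mean of precision and recall. The sketch below is only an illustration: the abstract does not say whether the reported 92.79 is a binary, macro, or weighted F1, so the code assumes a binary setting with the abusive class as the positive label and a percentage scale. The function name `f1_score` and the toy labels are hypothetical and do not come from the paper.

```python
def f1_score(true_labels, predicted_labels, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels (1 = abusive, 0 = not abusive); not data from the paper.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]

score = f1_score(y_true, y_pred)
print(f"F1 = {100 * score:.2f}")
```

Because F1 penalizes both missed abusive posts and false alarms, it is usually more informative than plain accuracy when abusive posts are a minority of the data.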

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Indian Institute of Technology (BHU) Varanasi, Varanasi, 221005, India
Paras Tiwari
Department of Computer Science and Engineering, Indian Institute of Information Technology, Design and Manufacturing, Jabalpur, 482005, India
Sawan Rai

Corresponding author

Correspondence to Paras Tiwari.

Editor information

Editors and Affiliations

St. Petersburg Federal Research Center of the Russian Academy of Sciences, St. Petersburg, Russia
Alexey Karpov
Moscow State Linguistic University, Moscow, Russia
Rodmonga Potapova

Cite this paper

Tiwari, P., Rai, S. (2021). Mind Your Tweet: Abusive Tweet Detection. In: Karpov, A., Potapova, R. (eds) Speech and Computer. SPECOM 2021. Lecture Notes in Computer Science, vol 12997. Springer, Cham. https://doi.org/10.1007/978-3-030-87802-3_63

DOI: https://doi.org/10.1007/978-3-030-87802-3_63
Published: 22 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87801-6
Online ISBN: 978-3-030-87802-3
eBook Packages: Computer Science, Computer Science (R0)