Detecting Toxic Comments Using FastText, CNN, and LSTM Models

Gandhi, Hetvi; Bachwani, Rounak; Nanade, Archana

doi:10.1007/978-3-031-37940-6_20

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1848)

Included in the following conference series:

International Conference on Advances in Computing and Data Sciences

Abstract

Using social media has become a daily activity for many people. It provides a platform for sharing news and information and for social interaction. However, some users abuse these platforms to harass and threaten others, which results in cyberbullying. Toxic comments are online remarks that are insulting, abusive, or inappropriate, and they frequently cause other users to leave a discussion. Cyberbullying and harassment prevent people from expressing their thoughts openly. Identifying and classifying such remarks by hand is time-consuming, inefficient, and unreliable. To address this problem, this paper develops a deep learning system that analyzes toxicity so that its negative consequences can be limited and institutions can put the necessary measures into practice. Our proposed model combines Long Short-Term Memory (LSTM) with FastText word embeddings, resulting in a model with high performance. The model aims to improve the detection of different kinds of toxicity and thereby make the social networking experience better. Our methodology assigns comments to six categories: Toxic, Severe Toxic, Obscene, Threat, Insult, and Identity-hate. Treating the task as multi-label classification provides an automatic way to handle toxic comments.
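The abstract frames the task as multi-label classification over six categories but does not say how model outputs are turned into labels. The sketch below illustrates one common convention, assumed here rather than taken from the paper: each category gets its own sigmoid probability, and every category at or above a threshold is assigned. The function names, the 0.5 threshold, and the example scores are all hypothetical.

```python
import math

# The six categories named in the abstract.
LABELS = ["Toxic", "Severe Toxic", "Obscene", "Threat", "Insult", "Identity-hate"]


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability, without overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_multilabel(logits, threshold=0.5):
    """Return every label whose independent sigmoid probability meets the threshold.

    Each label is scored on its own (no softmax across labels), so a
    comment can receive zero, one, or several labels.
    The 0.5 threshold is an assumption, not a value from the paper.
    """
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    probs = [sigmoid(z) for z in logits]
    return [label for label, p in zip(LABELS, probs) if p >= threshold]


# Hypothetical raw scores for one comment (made-up numbers, not model output).
example_logits = [2.0, -3.0, 1.2, -4.0, 0.3, -2.5]
predicted = decode_multilabel(example_logits)
print(predicted)  # ['Toxic', 'Obscene', 'Insult']
```

Because the labels are decided independently, a single comment can be both Toxic and Insult, or fall into none of the categories. That is what distinguishes this setup from single-label (softmax) classification.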

Author information

Authors and Affiliations

Computer Engineering Department, Mukesh Patel School of Technology Management and Engineering, NMIMS University, Mumbai, India
Hetvi Gandhi, Rounak Bachwani & Archana Nanade

Corresponding author

Correspondence to Hetvi Gandhi.

Editor information

Editors and Affiliations

Consilio Research Lab, Tallinn, Estonia
Mayank Singh
Jaypee University of Engineering and Technology, Guna, India
Vipin Tyagi
Jaypee University of Information Technology, Waknaghat, India
P.K. Gupta
Institute of Information Theory and Automation, Prague, Czech Republic
Jan Flusser
University of Ottawa, Ottawa, ON, Canada
Tuncer Ören

About this paper

Cite this paper

Gandhi, H., Bachwani, R., Nanade, A. (2023). Detecting Toxic Comments Using FastText, CNN, and LSTM Models. In: Singh, M., Tyagi, V., Gupta, P., Flusser, J., Ören, T. (eds) Advances in Computing and Data Sciences. ICACDS 2023. Communications in Computer and Information Science, vol 1848. Springer, Cham. https://doi.org/10.1007/978-3-031-37940-6_20

DOI: https://doi.org/10.1007/978-3-031-37940-6_20
Published: 23 July 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-37939-0
Online ISBN: 978-3-031-37940-6
eBook Packages: Computer Science, Computer Science (R0)
