Detecting Toxic Comments Using FastText, CNN, and LSTM Models

  • Conference paper
Advances in Computing and Data Sciences (ICACDS 2023)

Abstract

Social media use has become a daily routine. These platforms let people share news and information and interact with one another. However, some users abuse them to harass and threaten others, which amounts to cyberbullying. Toxic comments are online remarks that are insulting, abusive, or inappropriate, and they frequently cause other users to leave a discussion. Cyberbullying and harassment keep people from expressing their thoughts openly. Identifying and classifying such remarks by hand is time-consuming, inefficient, and unreliable. To address this, this paper develops a deep learning system that analyzes toxicity, so that institutions can put the necessary measures into practice and limit its negative consequences. The proposed model combines Long Short-Term Memory (LSTM) with FastText word embeddings and achieves high performance. It aims to improve the detection of different kinds of toxicity and thereby make social networking a better experience. The method assigns comments to six categories: Toxic, Severe Toxic, Obscene, Threat, Insult, and Identity-hate. Because a single comment can belong to several categories at once, the task is treated as multi-label classification, which provides an automatic way to handle toxic remarks.
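
To make the multi-label setting concrete, the following minimal sketch shows one common way to turn per-category scores into a set of labels: each category gets its own independent sigmoid probability, and every category at or above a threshold is reported. This is an illustration under stated assumptions, not the authors' implementation. The function names, the 0.5 threshold, and the example scores are all hypothetical; the abstract does not say how the model's outputs are thresholded.

```python
import math

# The six categories named in the abstract.
LABELS = ["Toxic", "Severe Toxic", "Obscene", "Threat", "Insult", "Identity-hate"]


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability, avoiding overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def predict_labels(logits, threshold=0.5):
    """Return every label whose independent probability reaches the threshold.

    `logits` is a hypothetical list of six raw scores, one per category, such
    as the final layer of a classifier might produce. Unlike softmax-based
    multi-class prediction, zero, one, or several labels can be returned.
    """
    if len(logits) != len(LABELS):
        raise ValueError("expected one logit per label")
    return [label for label, z in zip(LABELS, logits) if sigmoid(z) >= threshold]


# Illustrative scores only, not results from the paper.
example_logits = [2.1, -3.0, 1.2, -4.5, 0.4, -2.2]
print(predict_labels(example_logits))  # ['Toxic', 'Obscene', 'Insult']
```

Scoring each category independently is what lets a comment be, for example, both Obscene and an Insult; a lower threshold would trade precision for recall on rare categories such as Threat.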

Author information

Corresponding author

Correspondence to Hetvi Gandhi.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Gandhi, H., Bachwani, R., Nanade, A. (2023). Detecting Toxic Comments Using FastText, CNN, and LSTM Models. In: Singh, M., Tyagi, V., Gupta, P., Flusser, J., Ören, T. (eds) Advances in Computing and Data Sciences. ICACDS 2023. Communications in Computer and Information Science, vol 1848. Springer, Cham. https://doi.org/10.1007/978-3-031-37940-6_20

  • DOI: https://doi.org/10.1007/978-3-031-37940-6_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-37939-0

  • Online ISBN: 978-3-031-37940-6

  • eBook Packages: Computer Science, Computer Science (R0)
