research-article

Toxic Comment Classification Based on Bidirectional Gated Recurrent Unit and Convolutional Neural Network

Published: 21 December 2021

Abstract

For English toxic comment classification, this paper presents BG-GCNN, a model that combines a bidirectional gated recurrent unit (Bi-GRU) with a convolutional neural network (CNN) optimized by global average pooling. The model treats each type of toxic comment as a separate binary classification task. First, the Bi-GRU extracts the time-series features of the comment; the CNN optimized by global pooling then reduces their dimensionality. Finally, a sigmoid function outputs the classification result. Comparative experiments show that BG-GCNN classifies better than Text-CNN, LSTM, Bi-GRU, and other models. On the toxic comment dataset from the Kaggle competition platform, the model reaches a Macro-F1 of 0.62. The F1 values for three of the toxic labels (toxic, obscene, and insult) are 0.81, 0.84, and 0.74, respectively, which are the highest values in the comparative experiments.
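The abstract describes one sigmoid output per toxic label and reports both per-label F1 and Macro-F1. The sketch below shows how those quantities relate, assuming a 0.5 decision threshold and toy scores and labels; the function names, the threshold, and the data are illustrative assumptions and are not taken from the paper.

```python
import math


def sigmoid(z):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_labels(logits_by_label, threshold=0.5):
    """Treat each label as its own binary task: 1 if sigmoid(score) >= threshold.

    The 0.5 threshold is an assumption for illustration.
    """
    return {
        label: [1 if sigmoid(z) >= threshold else 0 for z in logits]
        for label, logits in logits_by_label.items()
    }


def f1_score(y_true, y_pred):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: define F1 as 0 to avoid dividing by zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true_by_label, y_pred_by_label):
    """Unweighted mean of the per-label F1 scores, plus the per-label scores."""
    scores = {
        label: f1_score(y_true_by_label[label], y_pred_by_label[label])
        for label in y_true_by_label
    }
    return sum(scores.values()) / len(scores), scores


# Toy example with two labels and four comments (hypothetical data).
toy_logits = {
    "toxic": [2.0, -1.0, -3.0, -0.5],
    "obscene": [1.5, 0.3, -2.0, -1.0],
}
toy_truth = {
    "toxic": [1, 1, 0, 0],
    "obscene": [1, 0, 0, 0],
}

toy_pred = predict_labels(toy_logits)
toy_macro, toy_scores = macro_f1(toy_truth, toy_pred)
print(toy_scores)
print(round(toy_macro, 3))
```

Because Macro-F1 weights every label equally, a few labels with low F1 can pull the average well below the scores of the best labels, which is consistent with a Macro-F1 of 0.62 alongside per-label values of 0.81, 0.84, and 0.74.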

    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing  Volume 21, Issue 3
    May 2022
    413 pages
    ISSN:2375-4699
    EISSN:2375-4702
    DOI:10.1145/3505182

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 21 December 2021
    Accepted: 01 August 2021
    Received: 01 October 2020
    Published in TALLIP Volume 21, Issue 3

    Author Tags

    1. Toxic comments classification
    2. bidirectional gated recurrent unit
    3. global pooling
    4. convolution neural network

    Qualifiers

    • Research-article
    • Refereed

    Funding Sources

    • National Natural Science Foundation in Higher Education of Anhui, China
    • Anhui Province Excellent Talents Project
