Abstract
Cyberbullying and hate speech are common problems in online etiquette. To address them, we propose a text classification model based on convolutional neural networks for the de facto verbal aggression dataset built in our previous work. We observe a significant improvement, which we attribute to the proposed 2D TF-IDF features used in place of pre-trained word vectors. Experiments demonstrate that the proposed system outperforms both our previous methods and other existing methods. A case study of word vectors examines the difficulty of using pre-trained word vectors for our short-text classification task and demonstrates the necessity of introducing 2D TF-IDF features. We also conduct a visual analysis of the convolutional and pooling layers of the trained convolutional neural networks.
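The abstract names 2D TF-IDF features but does not define them. The sketch below shows one plausible reading, stated here as an assumption and not as the authors' construction: each comment becomes a matrix with one row per token position and one column per vocabulary term, where row i is the one-hot vector of token i scaled by that token's TF-IDF weight. The function names, the smoothed IDF formula, and the toy comments are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def build_vocab(docs):
    """Map each distinct token to a column index, in sorted order."""
    return {tok: i for i, tok in enumerate(sorted({t for d in docs for t in d}))}


def idf_table(docs, vocab):
    """Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1.

    This smoothing variant is an assumption; the abstract does not say
    which IDF formula is used.
    """
    n_docs = len(docs)
    doc_freq = Counter(tok for d in docs for tok in set(d))
    return {tok: math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0 for tok in vocab}


def tfidf_matrix(doc, vocab, idf, max_len):
    """Build a max_len x |vocab| matrix for one tokenized comment.

    Row i holds the TF-IDF weight of token i in that token's vocabulary
    column and zeros elsewhere. Rows beyond the comment length stay zero
    (padding), and tokens outside the vocabulary produce an all-zero row.
    """
    term_freq = Counter(doc)
    matrix = [[0.0] * len(vocab) for _ in range(max_len)]
    for i, tok in enumerate(doc[:max_len]):
        if tok in vocab:
            matrix[i][vocab[tok]] = (term_freq[tok] / len(doc)) * idf[tok]
    return matrix


# Hypothetical toy corpus of tokenized comments, for illustration only.
docs = [
    ["you", "are", "so", "dumb"],
    ["have", "a", "nice", "day"],
    ["you", "are", "nice"],
]
vocab = build_vocab(docs)
idf = idf_table(docs, vocab)
features = tfidf_matrix(docs[0], vocab, idf, max_len=5)
```

Under this reading, the matrix keeps word order along one axis and word identity along the other, so a convolutional layer can slide over neighbouring tokens without relying on pre-trained embeddings. Rarer tokens such as "dumb" receive a larger weight than tokens shared across comments such as "you".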










Acknowledgements
The work described in this paper was substantially supported by two grants from the Research Grants Council of the Hong Kong Special Administrative Region (CityU 21200816 and CityU 11203217).
Ethics declarations
Conflict of interest
All the authors declare that they have no conflict of interest.
Cite this article
Chen, J., Yan, S. & Wong, KC. Verbal aggression detection on Twitter comments: convolutional neural network for short-text sentiment analysis. Neural Comput & Applic 32, 10809–10818 (2020). https://doi.org/10.1007/s00521-018-3442-0
DOI: https://doi.org/10.1007/s00521-018-3442-0