Convolutional Neural Networks for Twitter Text Toxicity Analysis

Georgakopoulos, Spiros V.; Tasoulis, Sotiris K.; Vrahatis, Aristidis G.; Plagianakos, Vassilis P.

doi:10.1007/978-3-030-16841-4_38

Conference paper
First Online: 03 April 2019

Part of the book series: Proceedings of the International Neural Networks Society (INNS, volume 1)

Abstract

Toxic comment classification is an emerging research field, with several studies addressing tasks in the detection of unwanted messages on communication platforms. Although sentiment analysis is an accurate approach for observing crowd behavior, it cannot discover other types of information in text, such as toxicity, which can often reveal hidden information. In this direction, a model for temporal tracking of comment toxicity is proposed, using tweets related to a hashtag under study. More specifically, a classifier is trained to predict toxic comments using a Convolutional Neural Network model. Next, given a hashtag, all relevant tweets are parsed and used as input to the classifier; hence, the knowledge about toxic texts is transferred to a new dataset for categorization. Meanwhile, an adapted change detection approach is applied to monitor changes in the toxicity trend over time within the hashtag tweets. Our experimental results show that toxic comment classification on Twitter conversations can reveal significant knowledge and that changes in toxicity are accurately identified over time.
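
The monitoring step described above can be illustrated with a minimal sketch. The abstract does not name the change detection method or its settings, so everything here is an assumption made for illustration: a plain two-sided CUSUM stands in for the "adapted change detection approach", the per-tweet scores are taken to be probabilities already produced by some toxicity classifier, and the function names (`toxic_fraction`, `cusum_changes`) and parameters (`threshold`, `baseline_len`, `drift`, `limit`) are hypothetical.

```python
from statistics import mean


def toxic_fraction(window_scores, threshold=0.5):
    """Share of tweets in one time window whose score exceeds `threshold`.

    `window_scores` is assumed to hold classifier outputs in [0, 1];
    the 0.5 cut-off is an illustrative choice, not taken from the paper.
    """
    return sum(s > threshold for s in window_scores) / len(window_scores)


def cusum_changes(series, baseline_len=5, drift=0.02, limit=0.3):
    """Two-sided CUSUM over a series of per-window toxic fractions.

    The reference level is the mean of the first `baseline_len` values.
    Deviations beyond `drift` accumulate in either direction; when one
    cumulative sum exceeds `limit`, the window index is reported, the
    reference level is reset to the current value (a simplification),
    and both sums restart from zero.
    """
    target = mean(series[:baseline_len])
    pos = neg = 0.0
    alarms = []
    for i, x in enumerate(series):
        pos = max(0.0, pos + (x - target - drift))  # upward shift evidence
        neg = max(0.0, neg + (target - x - drift))  # downward shift evidence
        if pos > limit or neg > limit:
            alarms.append(i)
            target = x
            pos = neg = 0.0
    return alarms


# Hypothetical per-window toxic fractions for one hashtag.
fractions = [0.10, 0.12, 0.09, 0.11, 0.10, 0.10, 0.11, 0.40, 0.42, 0.41, 0.39]
print(cusum_changes(fractions))
```

In this sketch `drift` sets how much deviation is tolerated as noise and `limit` sets how much accumulated evidence is needed before a change is reported; both would need tuning on real hashtag data.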

Notes

1. https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge

Acknowledgment

We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan X Pascal GPU used for this research. This project has received funding from the Hellenic Foundation for Research and Innovation (HFRI) and the General Secretariat for Research and Technology (GSRT), under grant agreement No 1901.

Author information

Authors and Affiliations

Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece
Spiros V. Georgakopoulos, Sotiris K. Tasoulis, Aristidis G. Vrahatis & Vassilis P. Plagianakos

Corresponding author

Correspondence to Vassilis P. Plagianakos.

Editor information

Editors and Affiliations

Department of Informatics, Bioengineering, Robotics, and Systems Engineering, University of Genova, Genoa, Italy
Luca Oneto
Department of Mathematics, University of Padova, Padua, Italy
Nicolò Navarin
Department of Mathematics, University of Padova, Padua, Italy
Alessandro Sperduti
Department of Informatics, Bioengineering, Robotics, and Systems Engineering, University of Genova, Genoa, Italy
Davide Anguita

About this paper

Cite this paper

Georgakopoulos, S.V., Tasoulis, S.K., Vrahatis, A.G., Plagianakos, V.P. (2020). Convolutional Neural Networks for Twitter Text Toxicity Analysis. In: Oneto, L., Navarin, N., Sperduti, A., Anguita, D. (eds) Recent Advances in Big Data and Deep Learning. INNSBDDL 2019. Proceedings of the International Neural Networks Society, vol 1. Springer, Cham. https://doi.org/10.1007/978-3-030-16841-4_38

DOI: https://doi.org/10.1007/978-3-030-16841-4_38
Published: 03 April 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16840-7
Online ISBN: 978-3-030-16841-4
eBook Packages: Intelligent Technologies and Robotics (R0)
