Convolutional Neural Networks for Twitter Text Toxicity Analysis

  • Conference paper

Part of the book series: Proceedings of the International Neural Networks Society (INNS, volume 1)

Abstract

Toxic comment classification is an emerging research field, and several studies have addressed various tasks in the detection of unwanted messages on communication platforms. Although sentiment analysis is an accurate approach for observing crowd behavior, it cannot discover other types of information in text, such as toxicity, which can often reveal hidden information. To this end, a model for temporal tracking of comment toxicity is proposed, using tweets related to the hashtag under study. More specifically, a classifier is trained to predict toxic comments using a Convolutional Neural Network model. Next, given a hashtag, all relevant tweets are parsed and used as input to the classifier; in this way, the knowledge about toxic texts is transferred to a new dataset for categorization. Meanwhile, an adapted change detection approach is applied to monitor changes in the toxicity trend over time within the hashtag tweets. Our experimental results show that toxic comment classification on Twitter conversations can reveal significant knowledge and that changes in toxicity are accurately identified over time.
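The change detection step can be illustrated with a minimal sketch. The abstract does not specify the adapted method, so the code below assumes a plain two-sided CUSUM test, which is a swapped-in stand-in and not the authors' procedure. The function name `cusum_change_points`, its parameters, and the synthetic toxicity values are all hypothetical; in practice the input would be per-window toxicity scores produced by the trained classifier.

```python
from typing import List, Sequence


def cusum_change_points(
    scores: Sequence[float],
    target_mean: float,
    drift: float = 0.05,
    threshold: float = 0.5,
) -> List[int]:
    """Return indices where a two-sided CUSUM statistic exceeds `threshold`.

    scores      -- per-window mean toxicity (hypothetical classifier output in [0, 1])
    target_mean -- reference toxicity level assumed for the "no change" regime
    drift       -- slack subtracted at each step so small fluctuations are ignored
    threshold   -- alarm level; both statistics restart after each alarm
    """
    upper = 0.0  # accumulates evidence of an upward shift
    lower = 0.0  # accumulates evidence of a downward shift
    alarms: List[int] = []
    for i, x in enumerate(scores):
        upper = max(0.0, upper + (x - target_mean - drift))
        lower = max(0.0, lower + (target_mean - x - drift))
        if upper > threshold or lower > threshold:
            alarms.append(i)
            upper = 0.0
            lower = 0.0
    return alarms


# Synthetic example (assumed data): toxicity hovers near 0.10, then jumps to about 0.40.
toxicity = [0.10, 0.12, 0.09, 0.11, 0.10, 0.40, 0.42, 0.38, 0.41, 0.39]
alarms = cusum_change_points(toxicity, target_mean=0.10)
```

With these assumed settings the first alarm is raised at index 6, one step after the jump at index 5. Because the reference level is kept fixed, the statistic keeps accumulating after the restart and raises a further alarm while toxicity stays elevated; a streaming variant would typically re-estimate `target_mean` after each detected change.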

Notes

  1. https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge

Acknowledgment

We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan X Pascal GPU used for this research. This project has received funding from the Hellenic Foundation for Research and Innovation (HFRI) and the General Secretariat for Research and Technology (GSRT), under grant agreement No 1901.

Author information

Corresponding author

Correspondence to Vassilis P. Plagianakos.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Georgakopoulos, S.V., Tasoulis, S.K., Vrahatis, A.G., Plagianakos, V.P. (2020). Convolutional Neural Networks for Twitter Text Toxicity Analysis. In: Oneto, L., Navarin, N., Sperduti, A., Anguita, D. (eds) Recent Advances in Big Data and Deep Learning. INNSBDDL 2019. Proceedings of the International Neural Networks Society, vol 1. Springer, Cham. https://doi.org/10.1007/978-3-030-16841-4_38
