Using Machine Learning to Detect the Signs of Radicalization and Hate Speech on Twitter

  • Conference paper

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 255)

Abstract

This paper addresses hate speech and radicalization, which are often spread through social media. Twitter lets users express their views in relative anonymity; however, it also appears to be a simple yet effective tool for disseminating offensive or radical content. The paper proposes an effective solution that applies machine learning to detect signs of radicalization and hate speech in Twitter posts. The authors chose to work with the Polish language, whose complexity is known to pose a challenge for automated sentiment analysis. They also had to create their own dataset of posts containing hate speech because, prior to the experiment, no such dataset existed for the language. The paper first presents the underlying technologies, then describes the course of the experiment, and finally gives the conclusions.

Author information

Corresponding author

Correspondence to Aleksandra Pawlicka.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Kuchczyński, M., Pawlicka, A., Pawlicki, M., Choraś, M. (2022). Using Machine Learning to Detect the Signs of Radicalization and Hate Speech on Twitter. In: Choraś, M., Choraś, R.S., Kurzyński, M., Trajdos, P., Pejaś, J., Hyla, T. (eds) Progress in Image Processing, Pattern Recognition and Communication Systems. CORES 2021, IP&C 2021, ACS 2021. Lecture Notes in Networks and Systems, vol 255. Springer, Cham. https://doi.org/10.1007/978-3-030-81523-3_21
