DOI: 10.1145/3292522.3326050
poster

Extremist Propaganda Tweet Classification with Deep Learning in Realistic Scenarios

Published: 26 June 2019

ABSTRACT

In this work, we tackled the problem of automatically classifying extremist propaganda on Twitter, focusing on the Islamic State of Iraq and al-Sham (ISIS). We built and published several datasets, obtained by mixing 15,684 ISIS propaganda tweets with a variable number of neutral ISIS-related tweets and random tweets, with imbalances as severe as 1%. We considered three state-of-the-art deep learning techniques, representative of the main current approaches to text classification, and two strong linear machine learning baselines. We compared their performance while varying the composition of the training and test sets, both to explore different training strategies and to evaluate the results under increasingly realistic conditions. We demonstrated that a Recurrent-Convolutional Neural Network based on pre-trained word embeddings can reach an F1 score of 0.9 on the most challenging test condition (1% imbalance).
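
To make the evaluation setting concrete, the following is a minimal sketch, not the authors' pipeline, of how a dataset with a fixed share of positive (propaganda) examples could be assembled and why F1 is more informative than accuracy at 1% imbalance. It assumes that "1% imbalance" means positives make up 1% of the data; the function names `mix_dataset` and `f1_score` and the toy strings are hypothetical and do not come from the paper.

```python
import random


def mix_dataset(positives, negatives, positive_ratio, seed=0):
    """Return shuffled (text, label) pairs with the requested share of positives.

    All positives are kept; just enough negatives are sampled so that
    len(positives) / total is approximately positive_ratio.
    """
    if not 0 < positive_ratio <= 1:
        raise ValueError("positive_ratio must be in (0, 1]")
    n_neg = round(len(positives) * (1 - positive_ratio) / positive_ratio)
    if n_neg > len(negatives):
        raise ValueError("not enough negatives for the requested ratio")
    rng = random.Random(seed)
    sampled = rng.sample(negatives, n_neg)
    data = [(text, 1) for text in positives] + [(text, 0) for text in sampled]
    rng.shuffle(data)
    return data


def f1_score(y_true, y_pred):
    """F1 of the positive class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy placeholder data (hypothetical): 10 positives among 1,000 examples.
positives = [f"positive tweet {i}" for i in range(10)]
negatives = [f"negative tweet {i}" for i in range(2000)]
data = mix_dataset(positives, negatives, positive_ratio=0.01)
labels = [label for _, label in data]

# A trivial classifier that always predicts "not propaganda".
always_negative = [0] * len(labels)
accuracy = sum(t == p for t, p in zip(labels, always_negative)) / len(labels)
trivial_f1 = f1_score(labels, always_negative)
```

On this toy set the always-negative classifier is right on 99% of the examples yet has an F1 of zero on the positive class, which is why a high F1 under heavy imbalance is a much stronger result than a high accuracy.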


Published in

WebSci '19: Proceedings of the 10th ACM Conference on Web Science
June 2019
395 pages
ISBN: 9781450362023
DOI: 10.1145/3292522

Copyright © 2019 Owner/Author

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

WebSci '19 paper acceptance rate: 41 of 130 submissions, 32%
Overall acceptance rate: 218 of 875 submissions, 25%
