ABSTRACT
In this work, we tackled the automatic classification of extremist propaganda on Twitter, focusing on the Islamic State of Iraq and al-Sham (ISIS). We built and published several datasets by mixing 15,684 ISIS propaganda tweets with a variable number of neutral tweets related to ISIS and of random tweets, producing class imbalances as severe as 1%. We considered three state-of-the-art deep learning techniques, representative of the main current approaches to text classification, and two strong linear machine learning baselines. We compared their performance while varying the composition of the training and test sets, both to explore different training strategies and to evaluate the results under increasingly realistic conditions. We demonstrated that a Recurrent-Convolutional Neural Network based on pre-trained word embeddings can reach an F1 score of 0.9 in the most challenging test condition (1% imbalance).
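To make the dataset construction and the metric concrete, the following is a minimal sketch, not the procedure used in this work. It assumes that "1% imbalance" means propaganda tweets make up about 1% of a mixed set, and that every propaganda tweet is kept while negatives are subsampled. The helpers `mix_at_ratio` and `f1_score` are hypothetical names introduced only for illustration.

```python
import random


def mix_at_ratio(positives, negatives, positive_fraction, seed=0):
    """Return shuffled (text, label) pairs with roughly the given positive share.

    Assumption: all positives are kept and negatives are subsampled.
    Label 1 marks a positive (propaganda) item, label 0 a negative one.
    """
    if not 0 < positive_fraction < 1:
        raise ValueError("positive_fraction must be strictly between 0 and 1")
    # Number of negatives needed so positives form `positive_fraction` of the mix.
    n_neg = round(len(positives) * (1 - positive_fraction) / positive_fraction)
    if n_neg > len(negatives):
        raise ValueError("not enough negatives for the requested ratio")
    rng = random.Random(seed)
    sampled = rng.sample(negatives, n_neg)
    mixed = [(text, 1) for text in positives] + [(text, 0) for text in sampled]
    rng.shuffle(mixed)
    return mixed


def f1_score(y_true, y_pred):
    """F1 of the positive class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data standing in for real tweets (placeholders, not from the datasets).
toy_pos = [f"pos-{i}" for i in range(10)]
toy_neg = [f"neg-{i}" for i in range(2000)]
toy_mix = mix_at_ratio(toy_pos, toy_neg, positive_fraction=0.01)

# A classifier that always predicts "negative" is 99% accurate on this mix
# but scores F1 = 0, which is why F1 is the more informative metric here.
labels = [label for _, label in toy_mix]
always_negative = [0] * len(labels)
accuracy = sum(t == p for t, p in zip(labels, always_negative)) / len(labels)
baseline_f1 = f1_score(labels, always_negative)
```

Under these assumptions, keeping all 15,684 propaganda tweets at a 1% share would require about 1.55 million negative tweets. The sketch also shows why accuracy alone is misleading at this imbalance: the trivial all-negative classifier is 99% accurate yet detects nothing.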
Index Terms
- Extremist Propaganda Tweet Classification with Deep Learning in Realistic Scenarios