Abstract
Antisemitism is a global phenomenon on the rise that negatively affects Jews and communities more broadly. It has been argued that social media has opened up new opportunities for antisemites to disseminate material and organize. It is, therefore, necessary to get a picture of the scope and nature of antisemitism on social media. However, identifying antisemitic messages in large datasets is not trivial, and more work is needed in this area. In this paper, we present and describe an annotated dataset that can be used to train tweet classifiers. We first explain how we created our dataset and how expert annotators approached identifying antisemitic content. We then describe the annotated data, in which 11% of conversations about Jews (January 2019–August 2020) and 13% of conversations about Israel (January–August 2020) were labeled antisemitic. Another important finding concerns lexical differences across queries and labels. We find that antisemitic content often relates to conspiracies of Jewish global dominance, the Middle East conflict, and the Holocaust.









Notes
The asterisk serves as a wildcard in our search query. Thus, the query includes "ZioNazi," "ZioNazis," and "ZioNazism."
IHRA Working Definition of Antisemitism, see https://www.holocaustremembrance.com/resources/working-definitions-charters/working-definition-antisemitism.
The two tweets did not contain enough information to reach an agreement. One of them was a reply to another user that contained only the word "Jews." In the series of tweets, it could mean that Jews were being blamed for something. However, it was unclear at the time the two commenters were discussing it. Both tweets have since been deleted.
For the keyword “ZioNazi*” we only find an agreement of 38%. This is due to one annotator choosing the following paragraph consistently for all antisemitic tweets: “Drawing comparisons of contemporary Israeli policy to that of the Nazis.” This relatively low number is a side effect of our annotation portal only allowing the choice of one Working Definition paragraph when annotating.
S. Bird, E. Klein, and E. Loper (2009), "Natural language processing with Python: analyzing text with the natural language toolkit," O’Reilly Media, Inc., PorterStemmer class.
Scikit-learn: Machine Learning in Python, Pedregosa et al., JMLR 12, pp. 2825–2830, 2011, CountVectorizer class.
The token 'palestinian' is the most frequent token in both antisemitic messages (1.46%, n = 18) and non-antisemitic messages (2.3%, n = 67).
In particular, tweets including the insult "Kikes" show a high usage of emojis and mixed language. Tweets classified as antisemitic span N = 722 unique tokens, of which 34.67% (n = 250) represent emojis. In contrast, non-antisemitic tweets span N = 788 unique tokens, of which only a small share, 4.82% (n = 38), represent emojis.
Figure 2 shows a word graph of the top 50 words per keyword. Edge weights, based on term frequencies from Sklearn’s CountVectorizer, connect words to the tweet annotation type, and words were grouped with the Louvain Modularity algorithm within the Gephi application. Before graphing, English stop words and keywords were removed, and the NLTK Porter Stemmer was applied for word stemming. This resulted in a total of 238 unique words.
This includes two duplicates in the randomized samples.
The input was the full tweet texts for each corpus. A corpus is one keyword combined with either antisemitic or non-antisemitic coding. Sklearn’s CountVectorizer was used to create a term frequency vector, filtering out words used in only one tweet and words used in more than 95% of the tweets, with a maximum of 10,000 words. Its word analyzer feature was applied, and words were stemmed into tokens.
For further reference see Yener (2020) Step by Step: Twitter Sentiment Analysis in Python, in Towards Data Science (https://towardsdatascience.com/step-by-step-twitter-sentiment-analysis-in-python-d6f650ade58d), accessed 20 September 2021.
For further API reference see https://textblob.readthedocs.io/en/dev/quickstart.html#sentiment-analysis.
Clarifying the concept of “followers” and “friends”: According to Twitter, “friends” are those whom the Twitter user follows (back), and “followers” are those who follow a particular user.
Usernames are only displayed if the accounts have more than 1000 followers or refer to an organization, institute, or NGO. Otherwise, they appear blacked out.
According to the profile's description, the bot slices random text and subsequently publishes it on Twitter. After examining the profile more thoroughly, we can assume that this user is not a human subject.
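The filtering described in the note on the term frequency vector (dropping words used in only one tweet and words used in more than 95% of tweets, capped at 10,000 words) can be illustrated with a minimal, dependency-free sketch. This is not the authors' code: the function name `build_vocabulary`, its parameter names, and the toy corpus are hypothetical, and stemming and stop-word removal are omitted. In scikit-learn, the same thresholds correspond to the `min_df=2`, `max_df=0.95`, and `max_features=10000` parameters of `CountVectorizer`.

```python
import re
from collections import Counter


def build_vocabulary(tweets, min_docs=2, max_doc_share=0.95, max_terms=10_000):
    """Return {token: corpus count} for tokens passing the document-frequency filters.

    Hypothetical helper, not part of any library.
    """
    doc_freq = Counter()   # number of tweets that contain each token
    term_freq = Counter()  # total occurrences of each token in the corpus
    for text in tweets:
        # Word tokens of two or more characters, lowercased.
        tokens = re.findall(r"\b\w\w+\b", text.lower())
        term_freq.update(tokens)
        doc_freq.update(set(tokens))

    n_docs = len(tweets)
    kept = {
        token: term_freq[token]
        for token, df in doc_freq.items()
        # Drop tokens seen in fewer than min_docs tweets or in too large a share.
        if df >= min_docs and df / n_docs <= max_doc_share
    }
    # Keep only the max_terms most frequent tokens (ties broken alphabetically).
    top = sorted(kept.items(), key=lambda item: (-item[1], item[0]))[:max_terms]
    return dict(top)


# Toy corpus for illustration only; not taken from the dataset.
corpus = [
    "alpha beta gamma",
    "alpha beta delta",
    "alpha gamma epsilon",
    "alpha beta",
]
vocab = build_vocabulary(corpus)
print(vocab)
```

In this toy corpus, "alpha" occurs in every tweet and is dropped by the 95% ceiling, while "delta" and "epsilon" occur in a single tweet each and are dropped by the lower bound, leaving only "beta" and "gamma".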
References
Adorno TW (1950) Prejudice in the Interview Material. In: Adorno TW, Frenkel-Brunswik E, Levinson DJ, Sanford RN (eds) The authoritarian personality. Studies in prejudice series, vol 1. Harper & Brothers, Manhattan, pp 605–653
American Jewish Committee (2021) The state of antisemitism in America 2021. Accessed from https://www.ajc.org/AntisemitismReport2021
Anti-Defamation League (2022) Audit of antisemitic incidents 2021. Accessed from https://adl.org/resources/report/audit-antisemitic-incidents-2021
Barlow E (2021) The social media pogrom. Tablet Magazine. May 25, 2021
Bruns A (2020) Big social data approaches in internet studies: the case of Twitter. In: Hunsinger J, Allen MM, Klastrup L (eds) Second international handbook of internet research. Springer, Dordrecht, pp 65–81
Center for the Study of Contemporary European Jewry, Tel Aviv University (2022) Antisemitism worldwide. Report 2021. Accessed from https://cst.tau.ac.il/annual-reports-on-worldwide-antisemitism/
Community Security Trust (2020) Coronavirus and the plague of antisemitism. Research brief. Accessed from https://cst.org.uk/data/file/d/9/Coronavirus%20and%20the%20plague%20of%20antisemitism.1615560607.pdf
Community Security Trust (2022) Incidents report 2021. Accessed from https://cst.org.uk/data/file/f/f/Incidents%20Report%202021.1644318940.pdf
Davidson T, Bhattacharya D, Weber I (2019) Racial bias in hate speech and abusive language detection datasets. Proceedings of the third workshop on abusive language online. Association for Computational Linguistics, Florence, pp 25–35
Deutscher Bundestag (2021) Drucksache 20/38. Antwort der Bundesregierung auf die Kleine Anfrage der Abgeordneten Petra Pau, Nicole Gohlke, Gökay Akbulut, weiterer Abgeordneter und der Fraktion DIE LINKE. Drucksache 20/6. Antisemitische Straftaten im dritten Quartal 2021. Accessed from https://dserver.bundestag.de/btd/20/000/2000038.pdf
European Commission, Directorate General for Justice and Consumers, Milo Comerford, Lea Gerster (2021) The rise of antisemitism online during the pandemic: a study of French and German content. Publications Office, Luxembourg
European Union Agency for Fundamental Rights (2018) Experiences and perceptions of antisemitism. Second survey on discrimination and hate crime against Jews in the EU. Luxembourg. Accessed from https://fra.europa.eu/en/publication/2018/experiences-and-perceptions-antisemitism-second-survey-discrimination-and-hate.
Herf J (2021) IHRA and JDA: Examining definitions of antisemitism in 2021. Fathom, April. Accessed from https://fathomjournal.org/ihra-and-jda-examining-definitions-of-antisemitism-in-2021/
Jikeli G, Cavar D, Miehling D (2019) Annotating antisemitic online content. Towards an applicable definition of antisemitism. https://doi.org/10.5967/3r3m-na89
Jikeli G, Awasthi D, Axelrod D, Miehling D, Wagh P, Joeng W (2021) Detecting anti-Jewish messages on social media. Building an annotated corpus that can serve as a preliminary gold standard. In: Proceedings of the ICWSM Workshops. ICWSM, US. https://doi.org/10.36190/2021.14
Malmasi S, Zampieri M (2017) Detecting hate speech in social media. Accessed from http://arxiv.org/abs/1712.06427
Marcus KL (2015) The definition of anti-semitism. Oxford University Press, New York
Porat D (2019) The working definition of antisemitism—a 2018 perception. In: Lange A, Mayerhofer K, Porat D, Schiffman LH (eds) Comprehending and confronting antisemitism. De Gruyter, Stroudsburg, pp 475–488
Schwarz-Friesel M (2019) ‘Antisemitism 2.0’—the spreading of Jew-hatred on the World Wide Web. In: Lange A, Mayerhofer K, Porat D, Schiffman LH (eds) Comprehending and confronting antisemitism. De Gruyter, Stroudsburg, pp 311–338
Service de Protection de la Communauté Juive (2022) 2021 Rapport sur l’antisémitisme en France. Accessed from https://www.spcj.org/rapport-sur-l-antis%C3%A9mitisme-2021
United Nations, Special Rapporteur on freedom of religion or belief (2019) Report on Combating Antisemitism to Eliminate Discrimination and Intolerance Based on Religion or Belief. A/74/358. Presented to the 74th Session of General Assembly on 17 October 2019. Accessed from https://undocs.org/A/74/358
Vidgen B, Nguyen D, Margetts H, Rossini P, Tromble R (2021) Introducing CAD: the contextual abuse dataset. Proceedings of the 2021 conference of the North American chapter of the association for computational linguistics: human language technologies. Association for Computational Linguistics, Stroudsburg, pp 2289–2303
Yang K-C, Varol O, Hui P-M, Menczer F (2020) Scalable and generalizable social bot detection through data selection. Proc AAAI Conf Artif Intell 34(01):1096–1103. https://doi.org/10.1609/aaai.v34i01.5460
Yener Y (2020) Step by step: Twitter sentiment analysis in Python. Towards Data Science. https://towardsdatascience.com/step-by-step-twitter-sentiment-analysis-in-python-d6f650ade58d. Accessed 20 Sept 2021
Zannettou S, Finkelstein J, Bradlyn B, Blackburn J (2020) A quantitative approach to understanding online antisemitism. In: Proceedings of the International AAAI Conference on Web and Social Media 14 (May), pp 786–797
Acknowledgements
This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562. We are grateful that we were able to use Indiana University’s Observatory on Social Media (OSoMe) tool and data (Davis et al. 2016). This research was supported by the Koret Foundation.
About this article
Cite this article
Jikeli, G., Axelrod, D., Fischer, R.K. et al. Differences between antisemitic and non-antisemitic English language tweets. Comput Math Organ Theory 30, 232–266 (2024). https://doi.org/10.1007/s10588-022-09363-2
DOI: https://doi.org/10.1007/s10588-022-09363-2