DOI: 10.1145/3570748.3570753
research-article

Exploring Crowdsourced Content Moderation Through Lens of Reddit during COVID-19

Published: 19 December 2022

Abstract

In 2020, when COVID-19 struck, social media gained even more influence in people’s lives as online activity increased. This shift led to a surge of false information and cyberbullying, making content moderation harder than ever. Given this challenge, it is vital to explore content moderation solutions that reduce hate speech and fake news on social media. In this paper, we examine whether existing content moderation systems are sufficient during global pandemics and, if not, where the gaps lie. We chose Reddit as the key social networking platform for our hypothesis testing because of its Decentralized Content Management System (DCMS). We used 1.8 million Reddit posts from COVID-19-related subreddits, spanning January 2020 to April 2021. Our findings reveal several significant trends in how a worldwide event affects content moderation methods designed to lessen the prevalence of hazardous content and fake news. We present the results of this study with particular attention to user-generated material and Reddit’s DCMS.

Cited By

  • (2024) An attack on free speech? Examining content moderation, (de-), and (re-) platforming on American right-wing alternative social media. New Media & Society. DOI: 10.1177/14614448241228850. Online publication date: 5-Feb-2024
  • (2024) A Cross Community Comparison of Muting in Conversations of Gendered Violence on Reddit. Proceedings of the ACM on Human-Computer Interaction 8:CSCW2 (1-29). DOI: 10.1145/3686940. Online publication date: 8-Nov-2024
  • (2023) Bots, disinformation, and the first impeachment of U.S. President Donald Trump. PLOS ONE 18:5 (e0283971). DOI: 10.1371/journal.pone.0283971. Online publication date: 8-May-2023
  • (2023) Will Admins Cope? Decentralized Moderation in the Fediverse. Proceedings of the ACM Web Conference 2023 (3109-3120). DOI: 10.1145/3543507.3583487. Online publication date: 30-Apr-2023

      Published In

      AINTEC '22: Proceedings of the 17th Asian Internet Engineering Conference
      December 2022
      104 pages
      ISBN:9781450399814
      DOI:10.1145/3570748
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. COVID-19
      2. content moderation
      3. fake news
      4. hate speech
      5. reddit

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Funding Sources

      • Engineering and Physical Sciences Research Council

      Conference

      AINTEC'22
      AINTEC'22: The 17th Asian Internet Engineering Conference
      December 19 - 21, 2022
      Hiroshima, Japan

      Acceptance Rates

      Overall Acceptance Rate 15 of 38 submissions, 39%

      Article Metrics

      • Downloads (Last 12 months): 55
      • Downloads (Last 6 weeks): 9
      Reflects downloads up to 05 Mar 2025

