DOI: 10.1145/3447535.3462640
research-article

Towards a Novel Benchmark for Automatic Generation of ClaimReview Markup

Published: 22 June 2021

Abstract

The spread of disinformation across the web has become a critical problem for democratic societies. Disseminating fake news has become a profitable business and a common practice among politicians and content producers. Meanwhile, journalists and fact-checkers work unceasingly to debunk misinformation and prevent it from spreading further. In 2015, a new web markup called ClaimReview was introduced to give search engines access to the meaning of fact-checking articles. It is an important initiative for fighting fake news because it promotes and highlights fact-check articles among users. However, barely half of fact-checkers have adopted the ClaimReview markup so far, resulting in low findability of fact-check articles, especially in under-represented countries and languages. In this work, we investigate the viability of using Artificial Intelligence to generate ClaimReview markup automatically. Besides promoting fact-check articles, the automatic generation of ClaimReview is an important step towards creating an up-to-date multilingual knowledge base for fighting disinformation. Our experiments show notable results, which indicate that the approach is a viable solution in a production environment. Furthermore, this work has created a benchmark that can be used in future investigations in this domain.
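For readers unfamiliar with the markup, the following is a minimal sketch of what a generated ClaimReview record could look like once the claim and the rating have been predicted. It is not the authors' pipeline: the function names, the example claim, the rating label, and the URL are hypothetical, and only a small subset of the schema.org ClaimReview properties (claimReviewed, reviewRating, url) is shown.

```python
import json


def build_claim_review(claim_reviewed: str, rating_label: str, article_url: str) -> dict:
    """Assemble a minimal schema.org ClaimReview object as a Python dict.

    The inputs stand in for values a model would predict from a
    fact-checking article (hypothetical helper, not from the paper).
    """
    return {
        "@context": "https://schema.org",
        "@type": "ClaimReview",
        "url": article_url,
        "claimReviewed": claim_reviewed,
        "reviewRating": {
            "@type": "Rating",
            # Textual verdict used by the fact-checker, e.g. "False".
            "alternateName": rating_label,
        },
    }


def to_script_tag(markup: dict) -> str:
    """Serialize the markup as a JSON-LD script element for embedding in HTML."""
    body = json.dumps(markup, ensure_ascii=False, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Illustrative values only; none of them come from the paper's data set.
example = build_claim_review(
    claim_reviewed="Example claim text extracted from the article",
    rating_label="False",
    article_url="https://example.org/fact-check/123",
)
print(to_script_tag(example))
```

A complete record would normally also carry fields such as the author and publication date of the review; they are omitted here to keep the sketch focused on the two fields named in the supplementary material title, ClaimReviewed and ReviewRating.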

Supplementary Material

MP4 File (PS1.3_EduardoCortes_Towards_a_NovelBenchmark_for_AutomaticGeneration_of_ClaimReviewMarkup.mp4)
Towards a Novel Benchmark for ClaimReviewed and ReviewRating Prediction in Fact-checking Articles

Published In

WebSci '21: Proceedings of the 13th ACM Web Science Conference 2021
June 2021
328 pages
ISBN:9781450383301
DOI:10.1145/3447535

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. ClaimReview
  2. data sets
  3. fact-checking
  4. machine learning
  5. misinformation

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES)

Conference

WebSci '21: 13th ACM Web Science Conference 2021
June 21 - 25, 2021
Virtual Event, United Kingdom

Acceptance Rates

Overall Acceptance Rate 245 of 933 submissions, 26%
