Abstract
Automated claim checking is the task of determining the veracity of a claim given evidence retrieved from a textual knowledge base of trustworthy facts. While previous work has taken the knowledge base as given and optimized the claim-checking pipeline, we take the opposite approach—taking the pipeline as given, we explore the choice of the knowledge base. Our first insight is that a claim-checking pipeline can be transferred to a new domain of claims with access to a knowledge base from the new domain. Second, we do not find a “universally best” knowledge base—higher domain overlap of a task dataset and a knowledge base tends to produce better label accuracy. Third, combining multiple knowledge bases does not tend to improve performance beyond using the closest-domain knowledge base. Finally, we show that the claim-checking pipeline’s confidence score for selecting evidence can be used to assess whether a knowledge base will perform well for a new set of claims, even in the absence of ground-truth labels.
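The final point above can be made concrete with a small sketch. The paper's actual pipeline and its evidence-selection score are not reproduced here; `evidence_confidence` below is a hypothetical stand-in (a simple token-overlap ratio) used only to illustrate the idea of averaging the pipeline's top-evidence confidence over a set of claims as a label-free proxy for knowledge-base suitability.

```python
def evidence_confidence(claim: str, passage: str) -> float:
    """Toy stand-in for a retriever's evidence-selection confidence:
    fraction of the claim's tokens that appear in the passage."""
    claim_tokens = set(claim.lower().split())
    passage_tokens = set(passage.lower().split())
    return len(claim_tokens & passage_tokens) / max(len(claim_tokens), 1)

def kb_suitability(claims: list[str], knowledge_base: list[str]) -> float:
    """Mean confidence of the best-matching passage per claim.
    No ground-truth labels are needed: a higher mean suggests the
    knowledge base covers the claims' domain better."""
    top_scores = [
        max(evidence_confidence(claim, passage) for passage in knowledge_base)
        for claim in claims
    ]
    return sum(top_scores) / len(top_scores)

# Illustrative comparison of two hypothetical knowledge bases.
claims = ["the earth orbits the sun"]
kb_in_domain = ["the earth orbits the sun once a year", "water boils at 100 C"]
kb_off_domain = ["stock markets fell sharply today"]
assert kb_suitability(claims, kb_in_domain) > kb_suitability(claims, kb_off_domain)
```

In practice the overlap score would be replaced by the claim-checking pipeline's own retrieval or evidence-selection confidence; the point of the sketch is only the aggregation step, which requires no veracity labels.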
Index Terms
- The Choice of Textual Knowledge Base in Automated Claim Checking