
The Choice of Textual Knowledge Base in Automated Claim Checking

Published: 02 March 2023

Abstract

Automated claim checking is the task of determining the veracity of a claim given evidence retrieved from a textual knowledge base of trustworthy facts. While previous work has taken the knowledge base as given and optimized the claim-checking pipeline, we take the opposite approach—taking the pipeline as given, we explore the choice of the knowledge base. Our first insight is that a claim-checking pipeline can be transferred to a new domain of claims with access to a knowledge base from the new domain. Second, we do not find a “universally best” knowledge base—higher domain overlap of a task dataset and a knowledge base tends to produce better label accuracy. Third, combining multiple knowledge bases does not tend to improve performance beyond using the closest-domain knowledge base. Finally, we show that the claim-checking pipeline’s confidence score for selecting evidence can be used to assess whether a knowledge base will perform well for a new set of claims, even in the absence of ground-truth labels.
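The final finding suggests a practical recipe: use the pipeline's evidence-selection confidence as an unsupervised signal for how well a candidate knowledge base matches a new domain of claims. The following is a minimal sketch of that idea, not the authors' implementation; `retrieve_evidence` and the `knowledge_bases` mapping are hypothetical stand-ins for the retrieval step of a claim-checking pipeline and its indexed corpora.

```python
# Sketch (hypothetical names, not the paper's code): rank candidate knowledge
# bases for a new set of claims by the pipeline's evidence-retrieval confidence,
# without any ground-truth veracity labels.

from statistics import mean
from typing import Callable, Dict, List, Tuple

def score_knowledge_bases(
    claims: List[str],
    knowledge_bases: Dict[str, object],
    retrieve_evidence: Callable[[str, object], List[Tuple[str, float]]],
) -> Dict[str, float]:
    """Average top-evidence confidence per knowledge base over all claims."""
    scores: Dict[str, float] = {}
    for name, kb in knowledge_bases.items():
        top_confidences = []
        for claim in claims:
            # retrieve_evidence returns ranked (passage, confidence) pairs.
            ranked = retrieve_evidence(claim, kb)
            if ranked:
                top_confidences.append(max(conf for _, conf in ranked))
        scores[name] = mean(top_confidences) if top_confidences else 0.0
    return scores

# Usage: pick the knowledge base whose evidence the pipeline is most confident in.
# best_kb, _ = max(score_knowledge_bases(claims, kbs, retrieve).items(),
#                  key=lambda kv: kv[1])
```

Under this heuristic, the knowledge base with the highest average confidence is the one expected to transfer best to the new claim domain, consistent with the abstract's observation that evidence-selection confidence can stand in for label accuracy when no labels are available.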



      • Published in

        Journal of Data and Information Quality, Volume 15, Issue 1
        March 2023
        197 pages
        ISSN: 1936-1955
        EISSN: 1936-1963
        DOI: 10.1145/3578367

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 2 March 2023
      • Online AM: 25 January 2023
      • Accepted: 26 August 2022
      • Revised: 22 August 2022
      • Received: 14 November 2021
