DOI: 10.1145/3132847.3133150
short-paper

TATHYA: A Multi-Classifier System for Detecting Check-Worthy Statements in Political Debates

Published: 06 November 2017

Abstract

Fact-checking political discussions has become an essential cog in computational journalism. This task encompasses an important sub-task: identifying the set of statements with 'check-worthy' claims. Previous work has treated this as a simple text classification problem, discounting the nuances involved in determining what makes statements check-worthy. We introduce a dataset of political debates from the 2016 US Presidential election campaign, annotated using all major fact-checking media outlets, and show that there is a need to model conversation context, debate dynamics and implicit world knowledge. We design a multi-classifier system, TATHYA, that models latent groupings in data and improves on state-of-the-art systems in detecting check-worthy statements by 19.5% in F1-score on a held-out test set, gaining primarily in recall.
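
To make the relationship between recall and F1-score concrete, the following minimal sketch computes precision, recall and F1 from confusion counts for the check-worthy class. The function name and all counts are hypothetical illustrations, not code or results from the paper; it only shows that, with precision held fixed, a higher recall yields a higher F1.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for the positive (check-worthy) class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts chosen for illustration only (NOT figures from the paper).
# Both systems have precision 0.5; the second finds more of the true positives.
baseline = precision_recall_f1(tp=20, fp=20, fn=60)
improved = precision_recall_f1(tp=40, fp=40, fn=40)

print("baseline (P, R, F1):", baseline)
print("improved (P, R, F1):", improved)
```

Because F1 is a harmonic mean, it is pulled toward the lower of the two values, so a system whose recall is far below its precision tends to gain more F1 from recovering missed statements than from a comparable gain in precision.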



Published In

CIKM '17: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management
November 2017
2604 pages
ISBN:9781450349185
DOI:10.1145/3132847

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. clustering
  2. computational journalism
  3. natural language processing

Qualifiers

  • Short-paper

Conference

CIKM '17

Acceptance Rates

CIKM '17 Paper Acceptance Rate 171 of 855 submissions, 20%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 136
  • Downloads (Last 6 weeks): 18
Reflects downloads up to 05 Mar 2025

Cited By

  • (2025) Facilitating automated fact-checking: a machine learning based weighted ensemble technique for claim detection. Discover Applied Sciences 7:1. DOI: 10.1007/s42452-024-06444-6. Online publication date: 11-Jan-2025
  • (2025) An introduction to computational argumentation research from a human argumentation perspective. Autonomous Agents and Multi-Agent Systems 39:1. DOI: 10.1007/s10458-025-09692-x. Online publication date: 1-Jun-2025
  • (2024) Building a framework for fake news detection in the health domain. PLOS ONE 19:7 (e0305362). DOI: 10.1371/journal.pone.0305362. Online publication date: 8-Jul-2024
  • (2024) Gradient-Based Adversarial Training on Transformer Networks for Detecting Check-Worthy Factual Claims. ACM Transactions on Intelligent Systems and Technology. DOI: 10.1145/3689212. Online publication date: 20-Aug-2024
  • (2024) Propositional claim detection: a task and dataset for the classification of claims to truth. Journal of Computational Social Science 7:2 (1727-1752). DOI: 10.1007/s42001-024-00289-0. Online publication date: 9-May-2024
  • (2024) Is checkworthiness generalizable? Evaluating task and domain generalization of datasets for claim detection. Neural Computing and Applications 36:24 (15165-15176). DOI: 10.1007/s00521-024-09896-4. Online publication date: 1-Aug-2024
  • (2023) MythQA: Query-Based Large-Scale Check-Worthy Claim Detection through Multi-Answer Open-Domain Question Answering. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (3017-3026). DOI: 10.1145/3539618.3591907. Online publication date: 19-Jul-2023
  • (2023) Re-Think Before You Share: A Comprehensive Study on Prioritizing Check-Worthy Claims. IEEE Transactions on Computational Social Systems 10:1 (362-375). DOI: 10.1109/TCSS.2021.3138642. Online publication date: Feb-2023
  • (2023) Towards Automated Fact-Checking: An Exploratory Study on the Detection of Checkable Statements in Spanish. 2023 42nd IEEE International Conference of the Chilean Computer Science Society (SCCC) (1-8). DOI: 10.1109/SCCC59417.2023.10315728. Online publication date: 23-Oct-2023
  • (2023) Holistic Analysis of Organised Misinformation Activity in Social Networks. Disinformation in Open Online Media (132-143). DOI: 10.1007/978-3-031-47896-3_10. Online publication date: 14-Nov-2023
