research-article

DESYR: Definition and Syntactic Representation Based Claim Detection on the Web

Authors:

Megha Sundriyal
Indraprastha Institute of Information Technology, Delhi, New Delhi, India

Parantak Singh
Birla Institute of Technology and Science, Pilani - Goa Campus, Zuari Nagar, India

Md. Shad Akhtar
Indraprastha Institute of Information Technology, Delhi, New Delhi, India

Shubhashis Sengupta
Accenture Labs Bangalore, Bangalore, India

Tanmoy Chakraborty
Indraprastha Institute of Information Technology, Delhi, New Delhi, India

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management, October 2021, Pages 1764–1773
https://doi.org/10.1145/3459637.3482423

Published: 30 October 2021

ABSTRACT

The formulation of a claim rests at the core of argument mining. Demarcating claims from non-claims is arduous for both humans and machines, owing to the latent linguistic variance between the two and the lack of extensive definition-based formalization. Furthermore, the growing use of online social media has resulted in an explosion of unsolicited information on the web, presented as informal text. To address these issues, we propose DESYR, a framework for informal web-based text that combines hierarchical representation learning (a dependency-inspired Poincaré embedding), definition-based alignment, and feature projection. Rather than fine-tuning compute-heavy language models, we build a lighter, more domain-centric approach. Experimental results indicate that DESYR improves upon the state-of-the-art system across four benchmark claim datasets, most of which were constructed from informal texts. We observe an increase of 3 claim-F1 points on the LESA-Twitter dataset; 1 claim-F1 point and 9 macro-F1 points on the Online Comments (OC) dataset; 24 claim-F1 points and 17 macro-F1 points on the Web Discourse (WD) dataset; and 8 claim-F1 points and 5 macro-F1 points on the Micro Texts (MT) dataset. We also perform an extensive analysis of the results. We make available a 100-D pre-trained version of our Poincaré variant, along with the source code.
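
For readers unfamiliar with Poincaré embeddings, the following minimal sketch illustrates the standard distance function on the Poincaré ball, the geometry that makes such embeddings suited to hierarchical data. It is not DESYR's implementation: the abstract does not describe how the dependency-inspired variant is constructed or trained, and the function name `poincare_distance` and the 2-D example points are assumptions chosen purely for illustration.

```python
import math


def poincare_distance(u, v):
    """Hyperbolic distance between two points strictly inside the unit ball.

    d(u, v) = arcosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2)))
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v))
    return math.acosh(arg)


# Hypothetical 2-D points (illustrative only; the released vectors are 100-D).
origin = [0.0, 0.0]   # near the centre: a natural place for a "root" node
mid = [0.5, 0.0]
leaf_a = [0.9, 0.0]   # near the boundary: a natural place for "leaf" nodes
leaf_b = [0.0, 0.9]

d_origin_mid = poincare_distance(origin, mid)      # equals ln(3), about 1.0986
d_leaves = poincare_distance(leaf_a, leaf_b)
euclidean_leaves = math.dist(leaf_a, leaf_b)

print(f"origin -> mid:       {d_origin_mid:.4f}")
print(f"leaf_a -> leaf_b:    {d_leaves:.4f}")
print(f"Euclidean (leaves):  {euclidean_leaves:.4f}")
```

Distances grow rapidly as points approach the boundary of the ball, so two points near the boundary are much farther apart in hyperbolic terms than their Euclidean separation suggests. This is the property that lets tree-like structures be embedded with little distortion in few dimensions.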

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing

Published in
CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
October 2021
4966 pages
ISBN:9781450384469
DOI:10.1145/3459637
General Chairs:
Gianluca Demartini
The University of Queensland, Australia
Guido Zuccon
The University of Queensland, Australia
Program Chairs:
J. Shane Culpepper
RMIT University, Australia
Zi Huang
The University of Queensland, Australia
Hanghang Tong
University of Illinois at Urbana-Champaign, USA
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
claim detection
feature projection
informal texts
linguistic grounding
Poincaré embedding
social media
twitter
Acceptance Rates
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%