Multilingual Evidence Retrieval and Fact Verification to Combat Global Disinformation: The Power of Polyglotism

Olteanu Roberts, Denisa A.

doi:10.1007/978-3-030-72240-1_36

Denisa A. Olteanu Roberts (ORCID: orcid.org/0000-0002-8916-6140)

Conference paper
First Online: 30 March 2021

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12657)

Abstract

This article investigates multilingual evidence retrieval and fact verification as a step toward combating global disinformation; to the best of our knowledge, it is the first effort of this kind. The goal is to build multilingual systems that retrieve evidence in evidence-rich languages in order to verify claims in evidence-poor languages, which are more commonly targeted by disinformation. To this end, our EnmBERT fact verification system shows evidence of transfer learning ability, and a 400-example mixed English-Romanian dataset is made available for evaluating cross-lingual transfer learning.

Author information

Authors and Affiliations

AI SpaceTime, New York, NY, USA
Denisa A. Olteanu Roberts

Corresponding author

Correspondence to Denisa A. Olteanu Roberts.

Editor information

Editors and Affiliations

Radboud University Nijmegen, Nijmegen, The Netherlands
Djoerd Hiemstra
Department of Computer Science, Katholieke Universiteit Leuven, Heverlee, Belgium
Marie-Francine Moens
Toulouse, Toulouse Institute of Computer Science Research, Toulouse, France
Josiane Mothe
Istituto di Scienza e Tecnologie dell’Informazione, Consiglio Nazionale delle Ricerche, Pisa, Italy
Raffaele Perego
Leipzig University, Leipzig, Germany
Martin Potthast
Istituto di Scienza e Tecnologie dell’Informazione, Consiglio Nazionale delle Ricerche, Pisa, Italy
Fabrizio Sebastiani

About this paper

Cite this paper

Olteanu Roberts, D.A. (2021). Multilingual Evidence Retrieval and Fact Verification to Combat Global Disinformation: The Power of Polyglotism. In: Hiemstra, D., Moens, M.-F., Mothe, J., Perego, R., Potthast, M., Sebastiani, F. (eds) Advances in Information Retrieval. ECIR 2021. Lecture Notes in Computer Science, vol 12657. Springer, Cham. https://doi.org/10.1007/978-3-030-72240-1_36

DOI: https://doi.org/10.1007/978-3-030-72240-1_36
Published: 30 March 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-72239-5
Online ISBN: 978-3-030-72240-1
eBook Packages: Computer Science, Computer Science (R0)