HybridFC: A Hybrid Fact-Checking Approach for Knowledge Graphs

Qudus, Umair; Röder, Michael; Saleem, Muhammad; Ngonga Ngomo, Axel-Cyrille

doi:10.1007/978-3-031-19433-7_27

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13489)

Included in the following conference series:

International Semantic Web Conference

Abstract

We consider fact-checking approaches that aim to predict the veracity of assertions in knowledge graphs. Five main categories of fact-checking approaches for knowledge graphs have been proposed in the recent literature, each of which is subject to partially overlapping limitations. In particular, current text-based approaches are limited by manual feature engineering. Path-based and rule-based approaches are limited by their exclusive use of knowledge graphs as background knowledge, and embedding-based approaches suffer from low accuracy scores on current fact-checking tasks. We propose a hybrid approach, dubbed HybridFC, that exploits the diversity of existing categories of fact-checking approaches within an ensemble learning setting to achieve a significantly better prediction performance. In particular, our approach outperforms the state of the art by 0.14 to 0.27 in terms of Area Under the Receiver Operating Characteristic curve on the FactBench dataset. Our code is open-source and can be found at https://github.com/dice-group/HybridFC.
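
For readers unfamiliar with the evaluation measure named in the abstract, the following is a minimal sketch of how an Area Under the ROC curve value can be computed from veracity scores. It is not taken from the HybridFC code base: the function name `auroc` and the toy labels and scores are hypothetical and only illustrate the measure, using its pairwise-ranking interpretation.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via its pairwise-ranking interpretation.

    The value equals the probability that a randomly chosen true assertion
    (label 1) receives a higher score than a randomly chosen false one
    (label 0); tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one true and one false assertion")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = true assertion, 0 = false assertion.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.2, 0.0]
print(auroc(labels, scores))
```

On this toy input, eight of the nine (true, false) pairs are ranked correctly, so the value is 8/9. An improvement of 0.14 to 0.27 on this scale therefore means that a substantially larger share of such pairs is ordered correctly.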

Notes

1. http://webdatacommons.org/structureddata/2021-12/stats/stats.html.
2. https://lod-cloud.net/.
3. For the assertion \(award\_00135\) from FactBench, COPAAL produces a score of 0.0 as it is unable to find a path between the assertion’s subject and its object.
4. https://www.mpi-inf.mpg.de/impact/exfakt#Tracy.
5. During a first evaluation, a simpler approach with only one multi-layer perceptron module (i.e., without \(\vartheta _{1}\) and \(\vartheta _{2}\)) showed insufficient performance.
6. https://www.elastic.co/.
7. We ran experiments with all available pre-trained models (not shown in the paper due to space limitations) from the SBert homepage (https://www.sbert.net/docs/pretrained_models.html) and found that nq-distilbert-base-v1 worked best for our approach.
8. A large number of KG embedding algorithms [12, 42, 49] have been developed in recent years. However, while many of them show promising effectiveness, their scalability is often limited. For many of them, generating embedding models for the whole of DBpedia is impractical (runtimes > 1 month). Hence, we only considered the approaches for which pre-trained DBpedia embeddings are available.
9. A fair comparison would not be possible with missing entities, which appear in many assertions.
10. We use a Wilcoxon signed rank test with a significance threshold \(\alpha =0.05\).
11. Due to space limitations, we exclude the results on the FactBench training set. These results are available on our GitHub page.
12. Source code: https://github.com/dice-group/HybridFC.
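
The significance test named in Note 10 can be sketched as follows. This is an illustrative, standard-library-only version of a two-sided Wilcoxon signed rank test with an exact p-value, not the authors' evaluation code. The function name `wilcoxon_signed_rank` and the paired scores are hypothetical; only the threshold of 0.05 comes from the note.

```python
from typing import Sequence, Tuple


def wilcoxon_signed_rank(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Two-sided Wilcoxon signed rank test for paired samples.

    Returns (W, p), where W is the smaller of the positive and negative rank
    sums and p is computed exactly by counting sign assignments. Zero
    differences are dropped; tied absolute differences get average ranks.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks within tie groups.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)

    # Under the null hypothesis every rank is positive or negative with
    # probability 1/2. Count subsets of ranks by their sum (subset-sum DP);
    # ranks are doubled so that averaged ranks stay integers.
    doubled = [int(round(2 * r)) for r in ranks]
    total = sum(doubled)
    counts = [0] * (total + 1)
    counts[0] = 1
    for r in doubled:
        for s in range(total, r - 1, -1):
            counts[s] += counts[s - r]
    tail = sum(counts[: int(round(2 * w)) + 1])
    p = min(1.0, 2.0 * tail / 2 ** n)
    return w, p


# Hypothetical paired scores of two systems on six test splits.
hybrid = [0.95, 0.91, 0.88, 0.97, 0.90, 0.93]
baseline = [0.80, 0.85, 0.70, 0.90, 0.60, 0.91]
w, p = wilcoxon_signed_rank(hybrid, baseline)
alpha = 0.05  # significance threshold stated in Note 10
print(f"W = {w}, p = {p:.5f}, significant = {p < alpha}")
```

In this toy case all six differences favour the first system, so W = 0 and the exact two-sided p-value is 2/64 = 0.03125, which is below the 0.05 threshold. With fewer than six pairs, a two-sided exact test cannot reach significance at this threshold.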

References

Athreya, R.G., Ngonga Ngomo, A.C., Usbeck, R.: Enhancing community interactions with data-driven chatbots-the dbpedia chatbot. In: Companion Proceedings of the The Web Conference 2018, pp. 143–146. WWW 2018, International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE (2018). https://doi.org/10.1145/3184558.3186964
Auer, S., Bizer, C., Kobilarov, G., Lehmann, J., Cyganiak, R., Ives, Z.: DBpedia: a nucleus for a web of open data. In: Aberer, K., et al. (eds.) ASWC/ISWC 2007. LNCS, vol. 4825, pp. 722–735. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-76298-0_52
Authors, A.: Mypublications dataset. https://doi.org/10.5281/zenodo.6523389
Authors, A.: Pre-trained embeddings for fact-checking datasets. https://doi.org/10.5281/zenodo.6523438
Bengio, Y., Courville, A., Vincent, P.: Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1798–1828 (2013)
Boland, K., Fafalios, P., Tchechmedjiev, A., Dietze, S., Todorov, K.: Beyond facts - a survey and conceptualisation of claims in online discourse analysis, March 2021. https://hal.mines-ales.fr/hal-03185097, working paper or preprint
Bordes, A., Usunier, N., Garcia-Durán, A., Weston, J., Yakhnenko, O.: Translating embeddings for modeling multi-relational data. In: Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2. NIPS 2013, pp. 2787–2795, Curran Associates Inc., Red Hook, NY, USA (2013)
Chen, Y., Goldberg, S., Wang, D.Z., Johri, S.S.: Ontological pathfinding: mining first-order knowledge from large knowledge bases. In: Proceedings of the 2016 International Conference on Management of Data. SIGMOD 2016, New York, NY, USA, pp. 835–846. Association for Computing Machinery (2016). https://doi.org/10.1145/2882903.2882954
Ciampaglia, G.L., Shiralkar, P., Rocha, L.M., Bollen, J., Menczer, F., Flammini, A.: Computational fact checking from knowledge networks. PLoS ONE 10(6), 1–13 (2015). https://doi.org/10.1371/journal.pone.0128193
Dai, Y., Wang, S., Xiong, N.N., Guo, W.: A survey on knowledge graph embedding: approaches, applications and benchmarks. Electronics 9(5) (2020). https://doi.org/10.3390/electronics9050750
Demir, C., Moussallem, D., Heindorf, S., Ngomo, A.C.N.: Convolutional hypercomplex embeddings for link prediction. In: Asian Conference on Machine Learning, pp. 656–671. PMLR (2021)
Demir, C., Ngomo, A.-C.N.: Convolutional complex knowledge graph embeddings. In: Verborgh, R., et al. (eds.) ESWC 2021. LNCS, vol. 12731, pp. 409–424. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-77385-4_24
Dong, X.L., Gabrilovich, E., Heitz, G., Horn, W., Lao, N., Murphy, K., Strohmann, T., Sun, S., Zhang, W.: Knowledge vault: a web-scale approach to probabilistic knowledge fusion. In: The 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2014, New York, NY, USA, 24–27 August 2014, pp. 601–610 (2014). http://www.cs.cmu.edu/nlao/publication/2014.kdd.pdf
Gad-Elrab, M.H., Stepanova, D., Urbani, J., Weikum, G.: Exfakt: a framework for explaining facts over knowledge graphs and text. In: Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining. WSDM 2019, New York, NY, USA, pp. 87–95. Association for Computing Machinery (2019). https://doi.org/10.1145/3289600.3290996
Gad-Elrab, M.H., Stepanova, D., Urbani, J., Weikum, G.: Tracy: tracing facts over knowledge graphs and text. In: The World Wide Web Conference. WWW 2019, pp. 3516–3520, New York, NY, USA. Association for Computing Machinery (2019). https://doi.org/10.1145/3308558.3314126
Galárraga, L., Teflioudi, C., Hose, K., Suchanek, F.M.: Fast rule mining in ontological knowledge bases with AMIE\(+\). VLDB J. 24(6), 707–730 (2015). https://doi.org/10.1007/s00778-015-0394-1
Galárraga, L.A., Teflioudi, C., Hose, K., Suchanek, F.: Amie: association rule mining under incomplete evidence in ontological knowledge bases. In: Proceedings of the 22nd International Conference on World Wide Web. WWW 2013, pp. 413–422, New York, NY, USA. Association for Computing Machinery (2013). https://doi.org/10.1145/2488388.2488425
Gardner, M., Mitchell, T.: Efficient and expressive knowledge base completion using subgraph feature extraction. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 1488–1498 (2015)
Gardner, M., Talukdar, P., Krishnamurthy, J., Mitchell, T.: Incorporating vector space similarity in random walk inference over knowledge bases. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 397–406, Doha, Qatar. Association for Computational Linguistics, October 2014. https://doi.org/10.3115/v1/D14-1044
Gerber, D., et al.: Defacto-temporal and multilingual deep fact validation. Web Semant. 35(P2), 85–101 (2015). https://doi.org/10.1016/j.websem.2015.08.001
Huang, J., et al.: Trustworthy knowledge graph completion based on multi-sourced noisy data. In: Laforest, F., et al. (eds.) WWW 2022: The ACM Web Conference 2022, Virtual Event, Lyon, France, April 25–29, 2022, pp. 956–965. ACM (2022). https://doi.org/10.1145/3485447.3511938
Huynh, V.P., Papotti, P.: Towards a benchmark for fact checking with knowledge bases. In: Companion Proceedings of the The Web Conference 2018, pp. 1595–1598. WWW 2018, Republic and Canton of Geneva, CHE. International World Wide Web Conferences Steering Committee (2018). https://doi.org/10.1145/3184558.3191616
Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: Proceedings of the 32nd International Conference on Machine Learning - Volume 37. ICML 2015, pp. 448–456. JMLR.org (2015)
Ji, G., He, S., Xu, L., Liu, K., Zhao, J.: Knowledge graph embedding via dynamic mapping matrix. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 687–696. Association for Computational Linguistics, Beijing, China, July 2015. https://doi.org/10.3115/v1/P15-1067
Kim, J., Choi, K.S.: Unsupervised fact checking by counter-weighted positive and negative evidential paths in a knowledge graph. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 1677–1686. International Committee on Computational Linguistics, Barcelona, Spain (Online), December 2020. https://doi.org/10.18653/v1/2020.coling-main.147
Kotonya, N., Toni, F.: Explainable automated fact-checking for public health claims. arXiv preprint arXiv:2010.09926 (2020)
Lajus, J., Galárraga, L., Suchanek, F.: Fast and exact rule mining with AMIE 3. In: Harth, A., et al. (eds.) The Semantic Web, pp. 36–52. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-49461-2_3
Li, F., Dong, X.L., Langen, A., Li, Y.: Knowledge verification for long-tail verticals. Proc. VLDB Endow. 10(11), 1370–1381 (2017). https://doi.org/10.14778/3137628.3137646
Lin, Y., Liu, Z., Sun, M., Liu, Y., Zhu, X.: Learning entity and relation embeddings for knowledge graph completion. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 29 (2015)
Malyshev, S., Krötzsch, M., González, L., Gonsior, J., Bielefeldt, A.: Getting the most out of Wikidata: semantic technology usage in Wikipedia’s knowledge graph. In: Vrandečić, D., et al. (eds.) The Semantic Web - ISWC 2018, pp. 376–394. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00668-6_23
Malyshev, S., Krötzsch, M., González, L., Gonsior, J., Bielefeldt, A.: Getting the most out of wikidata: semantic technology usage in Wikipedia’s knowledge graph. In: Vrandečić, D., et al. (eds.) ISWC 2018. LNCS, vol. 11137, pp. 376–394. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00668-6_23
Nakamura, S., et al.: Trustworthiness analysis of web search results. In: Kovács, L., Fuhr, N., Meghini, C. (eds.) ECDL 2007. LNCS, vol. 4675, pp. 38–49. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-74851-9_4
Ngonga Ngomo, A.C., Röder, M., Syed, Z.H.: Semantic web challenge 2019. Website (2019). https://github.com/dice-group/semantic-web-challenge.github.io/. Accessed 30 March 2022
Ortona, S., Meduri, V.V., Papotti, P.: Rudik: rule discovery in knowledge bases. Proc. VLDB Endow. 11(12), 1946–1949 (2018). https://doi.org/10.14778/3229863.3236231
Page, L., Brin, S., Motwani, R., Winograd, T.: The pagerank citation ranking: Bringing order to the web. Technical Report 1999–66, Stanford InfoLab, November 1999. http://ilpubs.stanford.edu:8090/422/, previous number = SIDL-WP-1999-0120
Paulheim, H., Ngonga Ngomo, A.C., Bennett, D.: Semantic web challenge 2018. Website (2018). http://iswc2018.semanticweb.org/semantic-web-challenge-2018/index.html. Accessed 30 March 2022
Reimers, N., Gurevych, I.: Sentence-BERT: sentence embeddings using Siamese BERT-networks. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 3982–3992. Association for Computational Linguistics, Hong Kong, China, November 2019. https://doi.org/10.18653/v1/D19-1410
Ristoski, P., Paulheim, H.: RDF2Vec: RDF graph embeddings for data mining. In: Groth, P., et al. (eds.) ISWC 2016. LNCS, vol. 9981, pp. 498–514. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46523-4_30
Rula, A., et al.: Tisco: temporal scoping of facts. Web Semant. 54(C), 72–86 (2019). https://doi.org/10.1016/j.websem.2018.09.002
Shi, B., Weninger, T.: Discriminative predicate path mining for fact checking in knowledge graphs. Knowl.-Based Syst. 104(C), 123–133 (2016). https://doi.org/10.1016/j.knosys.2016.04.015
Shiralkar, P., Flammini, A., Menczer, F., Ciampaglia, G.L.: Finding streams in knowledge graphs to support fact checking. In: 2017 IEEE International Conference on Data Mining (ICDM), pp. 859–864 (2017). https://doi.org/10.1109/ICDM.2017.105
da Silva, A.A.M., Röder, M., Ngomo, A.-C.N.: Using compositional embeddings for fact checking. In: Hotho, A., et al. (eds.) ISWC 2021. LNCS, vol. 12922, pp. 270–286. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-88361-4_16
Suchanek, F.M., Kasneci, G., Weikum, G.: Yago: a core of semantic knowledge. In: Proceedings of the 16th International Conference on World Wide Web, pp. 697–706. ACM (2007)
Sultana, T., Lee, Y.: Efficient rule mining and compression for RDF style KB based on Horn rules. J. Supercomput. (2022). https://doi.org/10.1007/s11227-022-04519-y
Sun, Y., Barber, R., Gupta, M., Aggarwal, C.C., Han, J.: Co-author relationship prediction in heterogeneous bibliographic networks. In: 2011 International Conference on Advances in Social Networks Analysis and Mining, pp. 121–128 (2011). https://doi.org/10.1109/ASONAM.2011.112
Syed, Z.H., Röder, M., Ngonga Ngomo, A.C.: Factcheck: validating RDF triples using textual evidence. In: Proceedings of the 27th ACM International Conference on Information and Knowledge Management. CIKM 2018, New York, NY, USA, pp. 1599–1602. Association for Computing Machinery (2018). https://doi.org/10.1145/3269206.3269308
Syed, Z.H., Srivastava, N., Röder, M., Ngomo, A.C.N.: Copaal - an interface for explaining facts using corroborative paths. In: ISWC Satellites (2019)
Syed, Z.H., Srivastava, N., Röder, M., Ngomo, A.N.: COPAAL - an interface for explaining facts using corroborative paths. In: Suárez-Figueroa, M.C., Cheng, G., Gentile, A.L., Guéret, C., Keet, C.M., Bernstein, A. (eds.) Proceedings of the ISWC 2019 Satellite Tracks (Posters & Demonstrations, Industry, and Outrageous Ideas) co-located with 18th International Semantic Web Conference (ISWC 2019), Auckland, New Zealand, October 26–30, 2019. CEUR Workshop Proceedings, vol. 2456, pp. 201–204. CEUR-WS.org (2019). http://ceur-ws.org/Vol-2456/paper52.pdf
Trouillon, T., Welbl, J., Riedel, S., Gaussier, E., Bouchard, G.: Complex embeddings for simple link prediction. In: International Conference on Machine Learning, pp. 2071–2080 (2016)
Wang, Q., Mao, Z., Wang, B., Guo, L.: Knowledge graph embedding: a survey of approaches and applications. IEEE Trans. Knowl. Data Eng. 29(12), 2724–2743 (2017). https://doi.org/10.1109/TKDE.2017.2754499
Watt, N., du Plessis, M.C.: Dropout algorithms for recurrent neural networks. In: Proceedings of the Annual Conference of the South African Institute of Computer Scientists and Information Technologists, New York, NY, USA, pp. 72–78. SAICSIT 2018, Association for Computing Machinery (2018). https://doi.org/10.1145/3278681.3278691

Acknowledgments

The work has been supported by the EU H2020 Marie Skłodowska-Curie project KnowGraphs (no. 860801), the German Federal Ministry for Economic Affairs and Climate Action (BMWK) funded project RAKI (no. 01MD19012B), and the German Federal Ministry of Education and Research (BMBF) funded EuroStars projects 3DFed (no. 01QE2114B) and FROCKG (no. 01QE19418). We are also grateful to Daniel Vollmers and Caglar Demir for the valuable discussion on earlier drafts.

Author information

Authors and Affiliations

DICE Group, Department of Computer Science, Universität Paderborn, Paderborn, Germany
Umair Qudus, Michael Röder, Muhammad Saleem & Axel-Cyrille Ngonga Ngomo

Corresponding author

Correspondence to Umair Qudus.

Editor information

Editors and Affiliations

University of Manchester, Manchester, UK
Ulrike Sattler
University of Chile, Santiago, Chile
Aidan Hogan
University of Cape Town, Cape Town, South Africa
Maria Keet
University of Bologna, Bologna, Italy
Valentina Presutti
Universidade Federal do Espírito Santo, Vitória, Brazil
João Paulo A. Almeida
National Institute of Informatics, Tokyo, Japan
Hideaki Takeda
Orange, Belfort, France
Pierre Monnin
Sapienza University of Rome, Rome, Italy
Giuseppe Pirrò
University of Bari, Bari, Italy
Claudia d’Amato

Cite this paper

Qudus, U., Röder, M., Saleem, M., Ngonga Ngomo, AC. (2022). HybridFC: A Hybrid Fact-Checking Approach for Knowledge Graphs. In: Sattler, U., et al. The Semantic Web – ISWC 2022. ISWC 2022. Lecture Notes in Computer Science, vol 13489. Springer, Cham. https://doi.org/10.1007/978-3-031-19433-7_27

DOI: https://doi.org/10.1007/978-3-031-19433-7_27
Published: 16 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19432-0
Online ISBN: 978-3-031-19433-7
eBook Packages: Computer Science, Computer Science (R0)
