Knowledge Injection to Counter Large Language Model (LLM) Hallucination

Martino, Ariana; Iannelli, Michael; Truong, Coleen

doi:10.1007/978-3-031-43458-7_34

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13998)

Included in the following conference series:

European Semantic Web Conference

Abstract

A shortfall of Large Language Model (LLM) content generation is hallucination, i.e., the inclusion of false information in the output. This is especially risky for enterprise use cases that require reliable, fact-based, controllable text generation at scale. To mitigate this, we use a technique called Knowledge Injection (KI), in which contextual data about the entities relevant to a text-generation task is mapped from a knowledge graph to text space for inclusion in an LLM prompt. Using the task of responding to online customer reviews of retail locations as an example, we found that KI increases the number of correct assertions included in generated text. In a qualitative review, a fine-tuned bloom-560m with KI outperformed a non-fine-tuned text-davinci-003 model from OpenAI, even though text-davinci-003 has 300 times more parameters. Thus, the KI method can increase enterprise users’ confidence in leveraging LLMs to replace tedious manual text generation, and it can enable better performance from smaller, cheaper models.
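As a rough illustration of the Knowledge Injection idea described in the abstract, the sketch below selects the facts for one entity from a toy knowledge graph, renders them as text, and places them in a prompt for a review-response task. It is a minimal sketch under stated assumptions, not the authors' implementation: the `Triple` class, the helper functions, the prompt wording, and the sample data are all hypothetical, and no LLM is actually called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One (subject, predicate, object) fact from a knowledge graph."""
    subject: str
    predicate: str
    obj: str


def facts_for(entity: str, graph: list[Triple]) -> list[Triple]:
    """Select the triples whose subject is the entity relevant to the task."""
    return [t for t in graph if t.subject == entity]


def to_text(triples: list[Triple]) -> str:
    """Map triples to text space, one '- predicate: object' line per fact."""
    return "\n".join(f"- {t.predicate}: {t.obj}" for t in triples)


def build_prompt(entity: str, review: str, graph: list[Triple]) -> str:
    """Assemble a review-response prompt that includes the injected facts."""
    context = to_text(facts_for(entity, graph))
    return (
        f"Facts about {entity}:\n{context}\n\n"
        f"Customer review:\n{review}\n\n"
        "Write a reply that uses only the facts above.\nReply:"
    )


# Hypothetical data, invented for illustration only.
GRAPH = [
    Triple("Example Cafe Downtown", "phone", "555-0100"),
    Triple("Example Cafe Downtown", "hours", "Mon-Fri 7am-6pm"),
    Triple("Example Cafe Uptown", "phone", "555-0199"),
]

prompt = build_prompt(
    "Example Cafe Downtown",
    "Great coffee, but when do you close?",
    GRAPH,
)
print(prompt)
```

The resulting string would be passed to whichever model is in use. Restricting the context to triples about the reviewed location keeps facts about other entities out of the prompt, which is the part of the technique aimed at reducing false assertions.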

All work in this paper was supported by and conducted at Yext.

Author information

Authors and Affiliations

Yext, New York, NY, 10011, USA
Ariana Martino, Michael Iannelli & Coleen Truong

Corresponding author

Correspondence to Ariana Martino.

Editor information

Editors and Affiliations

University of Lisbon, Lisbon, Portugal
Catia Pesquita
University of Nantes, Nantes, France
Hala Skaf-Molli
Foundation for Research and Technology, Heraklion, Greece
Vasilis Efthymiou
Vienna University of Economics and Business, Vienna, Austria
Sabrina Kirrane
University of Paderborn, Paderborn, Germany
Axel Ngonga
Fraunhofer IAIS, Sankt Augustin, Germany
Diego Collarana
IBM Research, Rio de Janeiro, Brazil
Renato Cerqueira
Institut Polytechnique de Paris, Palaiseau, France
Mehwish Alam
Institut de Recherche en Informatique de Toulouse, Toulouse, France
Cassia Trojahn
University of Mannheim, Mannheim, Germany
Sven Hertling

About this paper

Cite this paper

Martino, A., Iannelli, M., Truong, C. (2023). Knowledge Injection to Counter Large Language Model (LLM) Hallucination. In: Pesquita, C., et al. The Semantic Web: ESWC 2023 Satellite Events. ESWC 2023. Lecture Notes in Computer Science, vol 13998. Springer, Cham. https://doi.org/10.1007/978-3-031-43458-7_34

DOI: https://doi.org/10.1007/978-3-031-43458-7_34
Published: 21 October 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43457-0
Online ISBN: 978-3-031-43458-7
eBook Packages: Computer Science (R0)
