Knowledge-Grounded and Self-extending NER

Kamath Barkur, Sudarshan; Schacht, Sigurd; Lanquillon, Carsten

doi:10.1007/978-3-031-36004-6_60

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1836)

Included in the following conference series:

International Conference on Human-Computer Interaction


Abstract

The wave of digitization has begun. Organizations deal with huge amounts of data, such as logs, websites, and documents. A common way to make the information contained in these sources machine-accessible for automated processing is to first extract the information and then store it in a knowledge graph. A key task in this approach is to recognize entities. While common named entity recognition (NER) models work well for common entity types, they typically fail to recognize custom entities. Custom entity recognition requires data to be manually annotated and custom NER models to be trained. To efficiently extract the information, this paper proposes an innovative solution: Our Gazetteer approach uses a knowledge graph to create a coarse and fast NER component, reducing the need for manual annotation and saving human effort. Focusing on a university use case, our Gazetteer is integrated into a chatbot for entity recognition. In addition, data can be annotated using the Gazetteer and an NER model can be trained. Subsequently, the NER model can be used to recognize unseen custom entities, which are then added to the knowledge graph. This will improve the knowledge graph and make it self-extending.
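The loop described in the abstract (look up known entities, annotate text with them, then feed newly recognized entities back into the graph) can be illustrated with a minimal sketch. It is not the authors' implementation: the dictionary `KNOWLEDGE_GRAPH` is a hypothetical stand-in for a real knowledge graph, the entity labels and names are invented for illustration, and the `predictions` passed to `extend_graph` stand in for the output of a trained NER model.

```python
import re

# Hypothetical toy knowledge graph: entity label -> known surface forms.
KNOWLEDGE_GRAPH = {
    "COURSE": ["Machine Learning", "Data Science"],
    "LECTURER": ["Prof. Example"],
}


def build_gazetteer(graph):
    """Map each lower-cased surface form to its entity label."""
    return {name.lower(): label for label, names in graph.items() for name in names}


def annotate(text, gazetteer):
    """Return sorted (start, end, label) spans for gazetteer matches.

    Longer names are matched first so that overlapping shorter names
    do not produce a second span over the same characters.
    """
    spans = []
    taken = [False] * len(text)
    for name in sorted(gazetteer, key=len, reverse=True):
        # Require non-word characters (or string edges) around the match.
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            if not any(taken[m.start():m.end()]):
                spans.append((m.start(), m.end(), gazetteer[name]))
                for i in range(m.start(), m.end()):
                    taken[i] = True
    return sorted(spans)


def extend_graph(graph, predictions, gazetteer):
    """Add (surface, label) predictions that the gazetteer does not know yet."""
    added = []
    for surface, label in predictions:
        if surface.lower() not in gazetteer:
            graph.setdefault(label, []).append(surface)
            added.append((surface, label))
    return added
```

Spans produced by `annotate` could serve as weak labels for training an NER model, and `extend_graph` shows the self-extending step: only entities the gazetteer has not seen are written back.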



Author information

Authors and Affiliations

University of Applied Sciences, Ansbach, Germany
Sudarshan Kamath Barkur & Sigurd Schacht
University of Applied Sciences, Heilbronn, Germany
Carsten Lanquillon


Corresponding author

Correspondence to Sudarshan Kamath Barkur.

Editor information

Editors and Affiliations

University of Crete and Foundation for Research and Technology - Hellas (FORTH), Heraklion, Crete, Greece
Constantine Stephanidis
Foundation for Research and Technology - Hellas (FORTH), Heraklion, Crete, Greece
Margherita Antona
Foundation for Research and Technology - Hellas (FORTH), Heraklion, Crete, Greece
Stavroula Ntoa
University of Central Florida, Orlando, FL, USA
Gavriel Salvendy


About this paper

Cite this paper

Kamath Barkur, S., Schacht, S., Lanquillon, C. (2023). Knowledge-Grounded and Self-extending NER. In: Stephanidis, C., Antona, M., Ntoa, S., Salvendy, G. (eds) HCI International 2023 Posters. HCII 2023. Communications in Computer and Information Science, vol 1836. Springer, Cham. https://doi.org/10.1007/978-3-031-36004-6_60


DOI: https://doi.org/10.1007/978-3-031-36004-6_60
Published: 09 July 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36003-9
Online ISBN: 978-3-031-36004-6
eBook Packages: Computer Science, Computer Science (R0)
