DOI: 10.1145/3508230.3508252

Named Entity Recognition using Knowledge Graph Embeddings and DistilBERT

Published: 08 March 2022

Abstract

Named Entity Recognition (NER) is the Natural Language Processing (NLP) task of identifying entities in natural language text and classifying them into categories such as Person, Location, and Organization. Pre-trained neural language models (PNLMs) based on transformers are state-of-the-art in many NLP tasks, including NER. Analysis of the output of DistilBERT, a popular PNLM, reveals that misclassifications occur when a non-entity word appears in a position contextually suited to an entity. The paper is based on the hypothesis that the performance of a PNLM can be improved by combining it with Knowledge Graph Embeddings (KGE). We show that fine-tuning DistilBERT together with NumberBatch KGE yields performance improvements on several open-domain as well as biomedical-domain datasets.
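The abstract does not spell out the fusion mechanism, so the sketch below is only one plausible reading of it: concatenating DistilBERT's contextual token vectors with per-token knowledge graph embeddings (such as ConceptNet NumberBatch vectors) before a linear NER classification head. The class name DistilBertKgeForNer, the 300-dimensional KGE size, and the zero-vector fallback for tokens absent from the KGE vocabulary are illustrative assumptions, not the authors' implementation.

```python
# Sketch: fuse DistilBERT token representations with knowledge graph
# embeddings before the NER classifier. Assumed design, not the paper's code.
import torch
import torch.nn as nn
from transformers import DistilBertModel, DistilBertTokenizerFast

class DistilBertKgeForNer(nn.Module):
    def __init__(self, num_labels: int, kge_dim: int = 300):
        super().__init__()
        self.bert = DistilBertModel.from_pretrained("distilbert-base-uncased")
        hidden = self.bert.config.dim  # 768 for distilbert-base
        self.dropout = nn.Dropout(0.1)
        # Classifier sees the contextual vector concatenated with the KGE.
        self.classifier = nn.Linear(hidden + kge_dim, num_labels)

    def forward(self, input_ids, attention_mask, kge_vectors):
        # kge_vectors: (batch, seq_len, kge_dim), one KGE vector per token
        # (zeros for tokens with no entry in the KGE vocabulary).
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        fused = torch.cat([out.last_hidden_state, kge_vectors], dim=-1)
        return self.classifier(self.dropout(fused))

tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
model = DistilBertKgeForNer(num_labels=9)  # e.g. CoNLL-2003 BIO tag set
enc = tokenizer("Paris is in France", return_tensors="pt")
kge = torch.zeros(1, enc["input_ids"].size(1), 300)  # placeholder KGE lookup
logits = model(enc["input_ids"], enc["attention_mask"], kge)
print(logits.shape)  # (1, seq_len, 9)
```

Concatenation is only one fusion strategy; in practice the placeholder lookup would map each word's NumberBatch vector (when one exists) onto its sub-word token positions before fine-tuning end to end.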


Cited By

  • (2024) Comparison of Different BERT Models for Document Clustering. 2024 4th Asian Conference on Innovation in Technology (ASIANCON), 1–7. https://doi.org/10.1109/ASIANCON62057.2024.10837738. Online publication date: 23-Aug-2024.


Published In

NLPIR '21: Proceedings of the 2021 5th International Conference on Natural Language Processing and Information Retrieval
December 2021
175 pages
ISBN:9781450387354
DOI:10.1145/3508230

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Contextualized Word Representation
  2. Knowledge Graph Embeddings
  3. Named Entity Recognition
  4. Neural Networks

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

NLPIR 2021

