TOKEN Is a MASK: Few-shot Named Entity Recognition with Pre-trained Language Models

  • Conference paper
  • Text, Speech, and Dialogue (TSD 2022)

Abstract

Transferring knowledge from one domain to another is of practical importance for many tasks in natural language processing, especially when the amount of available data in the target domain is limited. In this work, we propose a novel few-shot approach to domain adaptation for Named Entity Recognition (NER): a two-step method consisting of a variable base module and a template module that leverages the knowledge captured in pre-trained language models with the help of simple descriptive patterns. Our approach is simple yet versatile and can be applied in both few-shot and zero-shot settings. Evaluation across a number of different datasets shows that this lightweight approach can boost the performance of state-of-the-art baselines by 2-5% F1-score.
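
The abstract describes the template module only at a high level; as the title suggests, the underlying idea is a cloze-style query to a pre-trained masked language model. The sketch below illustrates that general idea under stated assumptions: a template of the form "<token> is a [MASK].", an off-the-shelf BERT model, and a hand-picked mapping from predicted filler words to entity labels. The paper's actual templates, label words, and base module are not reproduced on this page.

```python
# Minimal, hypothetical sketch of the cloze-style idea suggested by the title:
# ask a pre-trained masked LM to fill "<candidate> is a [MASK]." and map the
# predicted filler word to an entity type. The template wording, label-word
# mapping, and candidate selection are illustrative assumptions, not the
# authors' exact configuration.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-cased")

# Hypothetical mapping from predicted filler words to coarse NER labels.
LABEL_WORDS = {
    "city": "LOC", "country": "LOC", "state": "LOC",
    "person": "PER", "man": "PER", "woman": "PER",
    "company": "ORG", "organization": "ORG",
}

def predict_entity_type(candidate: str, sentence: str, top_k: int = 10) -> str:
    """Return a coarse entity label for `candidate`, or 'O' if no label word is predicted."""
    prompt = f"{sentence} {candidate} is a [MASK]."
    for prediction in fill_mask(prompt, top_k=top_k):
        word = prediction["token_str"].strip().lower()
        if word in LABEL_WORDS:
            return LABEL_WORDS[word]
    return "O"

print(predict_entity_type("Berlin", "Angela Merkel visited Berlin in 2019."))
# A reasonable masked LM is likely to predict "city" here, yielding "LOC".
```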


Notes

  1. https://github.com/uds-lsv/TOKEN-is-a-MASK.

  2. We make use of the spaCy POS tagger: https://spacy.io/usage/linguistic-features (a minimal usage sketch follows below).
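
Footnote 2 names the spaCy POS tagger but not its exact role in the pipeline. Purely as an illustrative assumption, the sketch below shows how POS tags could be used to propose candidate entity mentions (contiguous runs of proper nouns) that a cloze template like the one above could then classify.

```python
# Illustrative only: use spaCy's POS tags to propose candidate entity mentions
# as contiguous runs of proper nouns. How the paper actually uses the tagger is
# not stated on this page.
import spacy

nlp = spacy.load("en_core_web_sm")  # small English pipeline; includes a POS tagger

def candidate_mentions(sentence: str) -> list[str]:
    """Return contiguous runs of proper nouns (PROPN) as candidate mentions."""
    doc = nlp(sentence)
    mentions, current = [], []
    for token in doc:
        if token.pos_ == "PROPN":
            current.append(token.text)
        elif current:
            mentions.append(" ".join(current))
            current = []
    if current:
        mentions.append(" ".join(current))
    return mentions

print(candidate_mentions("Angela Merkel visited Berlin in 2019."))
# e.g. ['Angela Merkel', 'Berlin']
```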


Acknowledgments

This work was funded by the EU Horizon 2020 projects COMPRISE (http://www.compriseh2020.eu/) under grant agreement No. 3081705 and ROXANNE under grant agreement No. 833635.

Author information

Corresponding author

Correspondence to David Ifeoluwa Adelani.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Davody, A., Adelani, D.I., Kleinbauer, T., Klakow, D. (2022). TOKEN Is a MASK: Few-shot Named Entity Recognition with Pre-trained Language Models. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech, and Dialogue. TSD 2022. Lecture Notes in Computer Science, vol 13502. Springer, Cham. https://doi.org/10.1007/978-3-031-16270-1_12

  • DOI: https://doi.org/10.1007/978-3-031-16270-1_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16269-5

  • Online ISBN: 978-3-031-16270-1

  • eBook Packages: Computer Science, Computer Science (R0)
