Abstract
In prior research on Named Entity Recognition (NER), the focus has been on addressing challenges arising from data scarcity and overfitting, particularly in the context of increasingly complex transformer-based architectures. A framework based on information retrieval (IR), which uses a search engine to enrich input samples and mitigate overfitting, has been proposed. However, the effectiveness of such systems is limited, as search engines were not designed for this specific application. While this approach provides a solid foundation, we argue that LLMs offer capabilities surpassing those of search engines, with greater flexibility in both semantic analysis and generation. To overcome these limitations, we propose CALM, a novel context augmentation method designed for adaptability through prompting. In our study, prompts are defined as pairs comprising a specific task and its corresponding response strategy; this careful definition of prompts is pivotal to achieving optimal performance. Our findings show that the resulting context improves robustness and performance on NER datasets: we achieve state-of-the-art F1 scores on WNUT17 and CoNLL++. We also examine the qualitative impact of prompting.
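The augmentation pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the prompt template, the `[SEP]` separator, and the stubbed `generate_context` function (standing in for a real LLM call) are all assumptions made for the sketch.

```python
# Minimal sketch of LLM-based context augmentation for NER.
# The prompt format and generate_context stub are illustrative assumptions,
# not CALM's actual prompts or model calls.

def build_prompt(task: str, strategy: str, sentence: str) -> str:
    """Compose a prompt from a (task, response-strategy) pair and the input sample."""
    return f"Task: {task}\nResponse strategy: {strategy}\nInput: {sentence}"

def generate_context(prompt: str) -> str:
    """Placeholder for an LLM call; a real system would query a model here."""
    # Canned output so the sketch is self-contained and runnable.
    return "Paris is the capital city of France."

def augment(sentence: str, task: str, strategy: str, sep: str = " [SEP] ") -> str:
    """Append the generated context to the original sample before NER tagging."""
    context = generate_context(build_prompt(task, strategy, sentence))
    return sentence + sep + context

augmented = augment(
    "Paris welcomed the delegates.",
    task="Describe the entities mentioned in the sentence.",
    strategy="Answer with one short factual sentence.",
)
print(augmented)
```

The key design point mirrored here is that the task and the response strategy vary independently, so the same input can be augmented under different prompting configurations, which is what enables the qualitative comparison of prompts.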
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Luiggi, T., Herserant, T., Tran, T., Soulier, L., Guigue, V. (2024). CALM: Context Augmentation with Large Language Model for Named Entity Recognition. In: Antonacopoulos, A., et al. Linking Theory and Practice of Digital Libraries. TPDL 2024. Lecture Notes in Computer Science, vol 15177. Springer, Cham. https://doi.org/10.1007/978-3-031-72437-4_16
DOI: https://doi.org/10.1007/978-3-031-72437-4_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72436-7
Online ISBN: 978-3-031-72437-4