RepEKShot: an evidential k-nearest neighbor classifier with repulsion loss for few-shot named entity recognition

Liu, Haitao; Peng, Weiming; Song, Jihua

doi:10.1007/s11227-024-06244-0

Published: 20 June 2024

Volume 80, pages 22069–22098, (2024)

The Journal of Supercomputing

Haitao Liu¹,
Weiming Peng²,³ &
Jihua Song¹

Abstract

Metric-based models have recently shown promising performance in the few-shot named entity recognition (NER) task. Many methods train their encoders with loss functions that focus on distinguishing different entity types, which neglects the ability to tell ground-truth labels apart from interfering labels when making predictions. Furthermore, nearest-neighbor inference is popular for metric-based models. However, other surrounding neighbors can also provide useful information for NER, and it is hard to determine whether the nearest neighbor is the most suitable referent when multiple neighbors are all close to the query sample. To solve these problems, we propose RepEKShot, a novel model that uses repulsion loss to train the encoder and extends the inference strategy from nearest neighbor to evidential k-nearest neighbor in the framework of Dempster–Shafer theory. Our model effectively optimizes the training of the encoder and exploits the information provided by other neighbors to give a more global perspective for few-shot NER. Extensive experiments have been conducted on two benchmarks with public datasets, and the results show that our model has performance merits in few-shot scenarios.
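
To make the idea of evidential k-nearest neighbor inference more concrete, the following is a minimal sketch of the classic evidential k-NN rule from Dempster–Shafer theory (in the style of Denoeux, 1995), not the RepEKShot implementation itself, whose details are not visible in this preview. Each of the k nearest support samples is treated as a piece of evidence that assigns some mass to its own label and the rest to the whole label set; the pieces are then pooled with Dempster's rule. The function name `evidential_knn`, the parameters `alpha` and `gamma`, the exponential mass function, and the toy 2-D "embeddings" are all assumptions made for illustration.

```python
import numpy as np

def evidential_knn(query, support, labels, n_classes, k=3, alpha=0.95, gamma=1.0):
    """Illustrative evidential k-NN (hypothetical helper, not the paper's code).

    Returns (m_single, m_omega): the combined mass on each singleton label
    and the mass left on the whole label set (remaining ignorance).
    """
    dists = np.linalg.norm(support - query, axis=1)
    nearest = np.argsort(dists)[:k]

    # Start from the vacuous mass function: all mass on the whole label set.
    m_single = np.zeros(n_classes)
    m_omega = 1.0

    for i in nearest:
        # Assumed mass function: closer neighbors give stronger support.
        s = alpha * np.exp(-gamma * dists[i] ** 2)
        e_single = np.zeros(n_classes)
        e_single[labels[i]] = s
        e_omega = 1.0 - s

        # Dempster's rule when focal sets are singletons or the whole set:
        # {q} with {q} -> {q}, {q} with Omega -> {q}, Omega with Omega -> Omega,
        # and {q} with {r}, q != r, is conflict that is normalized away.
        new_single = m_single * e_single + m_single * e_omega + m_omega * e_single
        new_omega = m_omega * e_omega
        norm = new_single.sum() + new_omega  # equals 1 - conflict
        m_single, m_omega = new_single / norm, new_omega / norm

    return m_single, m_omega

# Toy example: two labels, four support points in a 2-D embedding space.
support = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0]])
labels = np.array([0, 0, 1, 1])
masses, ignorance = evidential_knn(np.array([0.05, 0.0]), support, labels, n_classes=2)
predicted = int(masses.argmax())
```

Unlike a plain nearest-neighbor lookup, the prediction here depends on all k neighbors, and the residual mass on the whole label set gives a rough indication of how inconclusive the evidence is.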

Data availability

The sources of datasets have been listed in the paper.

Notes

Note that we only adopt the domain transfer scenario in the later benchmark, because the support sets for another tag set extension scenario are not publicly available.
According to the original setting, in 1-shot setting, there are 200 support-query pairs for testing CoNLL, GUM and WNUT, and 100 pairs for OntoNotes. In 5-shot setting, all the datasets are tested with 100 support-query pairs.
The data for CrossNER and Domain Transfer can be obtained from https://github.com/AtmaHou/FewShotTagging and https://github.com/asappresearch/structshot.
https://huggingface.co/bert-base-uncased.
https://huggingface.co/bert-base-cased.

Funding

This work was supported by the National Natural Science Foundation of China (Grant No. 62007004), the Beijing Natural Science Foundation (Grant No. 4234081) and the Major Program of Key Research Base of Humanities and Social Sciences of the Ministry of Education of China (22JJD740017).

Author information

Authors and Affiliations

School of Artificial Intelligence, Beijing Normal University, No. 19, Xinjiekouwai St, Haidian District, Beijing, 100875, China
Haitao Liu & Jihua Song
Chinese Character Research and Application Laboratory, Beijing Normal University, No. 19, Xinjiekouwai St, Haidian District, Beijing, 100875, China
Weiming Peng
Linguistic Data Consortium, University of Pennsylvania, Philadelphia, PA, 19104, USA
Weiming Peng

Contributions

Haitao Liu contributed to data curation, investigation, resources, software, and writing – original draft. Weiming Peng contributed to conceptualization, funding acquisition, methodology, and writing – review and editing. Jihua Song contributed to funding acquisition, formal analysis, project administration, and supervision.

Corresponding author

Correspondence to Jihua Song.

Ethics declarations

Conflict of interest

The authors declare no potential conflict of interest.

Ethical approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, H., Peng, W. & Song, J. RepEKShot: an evidential k-nearest neighbor classifier with repulsion loss for few-shot named entity recognition. J Supercomput 80, 22069–22098 (2024). https://doi.org/10.1007/s11227-024-06244-0

Accepted: 16 May 2024
Published: 20 June 2024
Issue Date: October 2024
DOI: https://doi.org/10.1007/s11227-024-06244-0
