Prompt-Based Self-training Framework for Few-Shot Named Entity Recognition

Huang, Ganghong; Zhong, Jiang; Wang, Chen; Dai, Qizhu; Li, Rongzhen

doi:10.1007/978-3-031-10989-8_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13370)

Included in the following conference series:

International Conference on Knowledge Science, Engineering and Management


Abstract

Exploiting unlabeled data is a plausible way to improve few-shot named entity recognition (few-shot NER), where only a small number of labeled examples are available for each entity type. Existing work focuses on training deep NER models with self-training. However, self-training can produce incomplete and noisy labels, which may fail to improve model performance or may even degrade it. To address this challenge, we propose a prompt-based self-training framework with two stages. In the first stage, we introduce a self-training approach with prompt tuning to improve model performance. Specifically, we explore several label selection strategies in self-training to mitigate error propagation from noisy pseudo-labels. In the second stage, we fine-tune the BERT model on the high-confidence pseudo-labels together with the original labels. We conduct experiments on two benchmark datasets. The results show that our method outperforms existing few-shot NER models by significant margins, demonstrating its effectiveness in the few-shot setting.
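The abstract does not say which label selection strategies are used, so the following is only a minimal sketch of one common option, a per-token confidence threshold, and not the authors' method. The names `select_pseudo_labels` and `merge_training_sets`, the default threshold of 0.9, and the toy probabilities are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Example = Tuple[List[str], List[Optional[str]]]  # (tokens, tags); None = no label


def select_pseudo_labels(
    token_probs: Sequence[Sequence[float]],
    labels: Sequence[str],
    threshold: float = 0.9,  # assumed value, not taken from the paper
) -> List[Optional[str]]:
    """Keep the most probable label of each token only if it is confident enough."""
    selected: List[Optional[str]] = []
    for probs in token_probs:
        best = max(range(len(probs)), key=probs.__getitem__)
        # Low-confidence tokens get None so they can be skipped later.
        selected.append(labels[best] if probs[best] >= threshold else None)
    return selected


def merge_training_sets(
    gold: Sequence[Example],
    unlabeled_tokens: Sequence[List[str]],
    unlabeled_probs: Sequence[Sequence[Sequence[float]]],
    labels: Sequence[str],
    threshold: float = 0.9,
) -> List[Example]:
    """Combine original labels with the pseudo-labels that passed the threshold."""
    merged: List[Example] = list(gold)
    for tokens, probs in zip(unlabeled_tokens, unlabeled_probs):
        tags = select_pseudo_labels(probs, labels, threshold)
        # Drop sentences where no token was labeled confidently.
        if any(tag is not None for tag in tags):
            merged.append((tokens, tags))
    return merged


# Toy data: hypothetical first-stage model outputs for one unlabeled sentence.
LABELS = ["O", "PER", "LOC"]
gold = [(["Paris", "is", "nice"], ["LOC", "O", "O"])]
tokens = [["Alice", "visited", "town"]]
probs = [[[0.05, 0.93, 0.02], [0.50, 0.30, 0.20], [0.97, 0.02, 0.01]]]

train_set = merge_training_sets(gold, tokens, probs, LABELS)
print(train_set[-1])
```

In a second-stage fine-tuning loop, tokens tagged `None` would typically be excluded from the loss, so that only confident pseudo-labels and the original labels contribute to the update.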

Acknowledgements

The authors would like to thank the Associate Editor and anonymous reviewers for their valuable comments and suggestions. This work is funded in part by the National Natural Science Foundation of China under Grant No. 62176029, and in part by the graduate research and innovation foundation of Chongqing, China under Grant No. CYB21063. This work is also supported in part by the National Key Research and Development Program of China under Grant 2017YFB1402400, the Major Project of Chongqing Higher Education Teaching Reform Research (191003), and the New Engineering Research and Practice Project of the Ministry of Education (E-JSJRJ20201335).

Author information

Authors and Affiliations

College of Computer Science, Chongqing University, Chongqing, 400044, China
Ganghong Huang, Jiang Zhong, Chen Wang, Qizhu Dai & Rongzhen Li

Corresponding author

Correspondence to Jiang Zhong.

Editor information

Editors and Affiliations

Télécom Paris, Paris, France
Gerard Memmi
Purdue University, West Lafayette, IN, USA
Baijian Yang
Shanghai Jiao Tong University, Shanghai, Shanghai, China
Linghe Kong
Nanyang Technological University, Singapore, Singapore
Tianwei Zhang
Texas A&M University – Commerce, Commerce, TX, USA
Meikang Qiu

About this paper

Cite this paper

Huang, G., Zhong, J., Wang, C., Dai, Q., Li, R. (2022). Prompt-Based Self-training Framework for Few-Shot Named Entity Recognition. In: Memmi, G., Yang, B., Kong, L., Zhang, T., Qiu, M. (eds) Knowledge Science, Engineering and Management. KSEM 2022. Lecture Notes in Computer Science, vol 13370. Springer, Cham. https://doi.org/10.1007/978-3-031-10989-8_8

DOI: https://doi.org/10.1007/978-3-031-10989-8_8
Published: 19 July 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-10988-1
Online ISBN: 978-3-031-10989-8
eBook Packages: Computer Science, Computer Science (R0)
