A Span-Based Distantly Supervised NER with Self-learning

Mao, Hongli; Tang, Hanlin; Zhang, Wen; Huang, Heyan; Mao, Xian-Ling

doi:10.1007/978-3-030-60450-9_16

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12430)

Included in the following conference series:

CCF International Conference on Natural Language Processing and Chinese Computing

Abstract

The lack of labeled data is one of the major obstacles for named entity recognition (NER). Distant supervision, which automatically generates annotated training datasets from dictionaries, is often used to alleviate this problem. However, as far as we know, existing distant-supervision-based methods do not consider latent entities that are not in the dictionaries. Intuitively, entities of the same type have similar contextualized features, so we can use these features to extract latent entities from the corpus into the corresponding dictionaries and thereby improve the performance of distant-supervision-based methods. Thus, in this paper, we propose a novel span-based self-learning method, which employs span-level features to update the corresponding dictionaries. Specifically, the proposed method directly takes all possible spans into account and scores them for each label, then adds latent entities from the candidate spans to the corresponding dictionaries based on both local and global features. Extensive experiments on two public datasets show that our proposed method performs better than the state-of-the-art baselines.
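The pipeline outlined in the abstract (enumerate all spans, label them by dictionary lookup, then move confident unlabeled spans into the dictionaries) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `max_len` cap, the `threshold`, the whitespace joining of tokens, and the caller-supplied `score` function are all assumptions made for illustration. In particular, `score` stands in for whatever model produces per-label span scores; the local and global features mentioned in the abstract are not modeled here.

```python
from typing import Callable, Dict, Iterator, List, Set, Tuple

Span = Tuple[int, int]  # token offsets, end-exclusive


def enumerate_spans(tokens: List[str], max_len: int) -> Iterator[Span]:
    """Yield every contiguous span of at most max_len tokens."""
    for start in range(len(tokens)):
        for end in range(start + 1, min(start + max_len, len(tokens)) + 1):
            yield (start, end)


def distant_labels(
    tokens: List[str], dictionaries: Dict[str, Set[str]], max_len: int = 4
) -> Dict[Span, str]:
    """Label a span with a type when its surface form is in that type's dictionary."""
    labels: Dict[Span, str] = {}
    for start, end in enumerate_spans(tokens, max_len):
        surface = " ".join(tokens[start:end])  # assumption: whitespace-joined tokens
        for label, entries in dictionaries.items():
            if surface in entries:
                labels[(start, end)] = label
    return labels


def expand_dictionaries(
    tokens: List[str],
    dictionaries: Dict[str, Set[str]],
    score: Callable[[List[str], Span, str], float],  # hypothetical span scorer
    threshold: float = 0.9,  # assumption: fixed confidence cut-off
    max_len: int = 4,
) -> List[Tuple[str, str]]:
    """Add unlabeled spans whose best label score reaches the threshold."""
    known = distant_labels(tokens, dictionaries, max_len)
    added: List[Tuple[str, str]] = []
    for span in enumerate_spans(tokens, max_len):
        if span in known:
            continue  # already covered by a dictionary entry
        surface = " ".join(tokens[span[0]:span[1]])
        best = max(dictionaries, key=lambda lab: score(tokens, span, lab))
        if surface in dictionaries[best]:
            continue  # added earlier in this pass
        if score(tokens, span, best) >= threshold:
            dictionaries[best].add(surface)
            added.append((surface, best))
    return added
```

Calling `expand_dictionaries` repeatedly, with the scorer retrained on the enlarged dictionaries between calls, would give one possible self-learning loop; how the paper actually schedules these updates is not stated in the abstract.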

H. Mao and H. Tang—Equal contribution.

Acknowledgement

The work is supported by the National Key R&D Plan (No. 2018YFB1005100), NSFC (Nos. 61772076, 61751201, 61602197 and U19B2020), and NSFB (No. Z181100008918002). We also thank Yuming Shang, Jiaxin Wu, Maxime Hugueville and the anonymous reviewers for their helpful comments.

Author information

Authors and Affiliations

School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
Hongli Mao, Hanlin Tang, Heyan Huang & Xian-Ling Mao
Huazhong University of Science and Technology, Wuhan, China
Wen Zhang

Corresponding author

Correspondence to Xian-Ling Mao.

Editor information

Editors and Affiliations

ECE & Ingenuity Labs Research Institute, Queen’s University, Kingston, ON, Canada
Xiaodan Zhu
Department of Computer Science and Technology, Tsinghua University, Beijing, China
Min Zhang
School of Computer Science and Technology, Soochow University, Suzhou, China
Yu Hong
College of Intelligence and Computing, Tianjin University, Tianjin, China
Ruifang He

Cite this paper

Mao, H., Tang, H., Zhang, W., Huang, H., Mao, X.-L. (2020). A Span-Based Distantly Supervised NER with Self-learning. In: Zhu, X., Zhang, M., Hong, Y., He, R. (eds) Natural Language Processing and Chinese Computing. NLPCC 2020. Lecture Notes in Computer Science, vol 12430. Springer, Cham. https://doi.org/10.1007/978-3-030-60450-9_16

DOI: https://doi.org/10.1007/978-3-030-60450-9_16
Published: 02 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60449-3
Online ISBN: 978-3-030-60450-9
eBook Packages: Computer Science, Computer Science (R0)
