Semi-supervised Learning for Fine-Grained Entity Typing with Mixed Label Smoothing and Pseudo Labeling

Database Systems for Advanced Applications (DASFAA 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13945)

Abstract

Distant supervision (DS) has been proposed to annotate data automatically and has achieved significant success in fine-grained entity typing (FET). Despite its efficiency, distant supervision often suffers from noisy labels. To mitigate this noise, existing approaches assume that the training data contains a “clean” set and a “noisy” set and apply a different method to each. However, they still suffer from confirmation bias on the “noisy” set and from false positives on the “clean” set. To address these issues, we propose a novel semi-supervised learning method with mixed label smoothing and pseudo labeling for distantly supervised fine-grained entity typing. Specifically, to address false positives on the “clean” set, we propose a mixed label smoothing method that smooths the labels of the “clean” set before they are used to train the FET model. To address confirmation bias on the “noisy” set, we discard the labels of the “noisy” set and instead assign labels to it with a pseudo labeling technique. Extensive experiments on three widely used FET datasets show the effectiveness of the proposed approach. The source code is publicly available at https://github.com/xubodhu/NFETC-SSL.
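
The two ingredients named in the abstract can be illustrated with a minimal, generic sketch. The abstract does not specify how the "mixed" label smoothing is constructed or how pseudo labels are selected, so the code below shows only the standard building blocks: uniform label smoothing and confidence-thresholded pseudo labeling. The function names, the `epsilon` and `threshold` parameters, and their default values are assumptions made for illustration and are not taken from the paper or its repository.

```python
from typing import List, Optional


def smooth_labels(one_hot: List[float], epsilon: float = 0.1) -> List[float]:
    """Uniform label smoothing (generic form, not the paper's mixed variant).

    Keeps (1 - epsilon) of the original label mass and spreads epsilon
    evenly over all k classes, so a possibly wrong "clean" label is
    trusted a little less.
    """
    k = len(one_hot)
    return [(1.0 - epsilon) * y + epsilon / k for y in one_hot]


def pseudo_label(probs: List[float], threshold: float = 0.9) -> Optional[int]:
    """Confidence-thresholded pseudo labeling (generic form).

    Ignores any distant-supervision label and returns the model's most
    probable class only if its probability reaches the threshold;
    otherwise returns None, meaning the sample is left unlabeled.
    """
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best if probs[best] >= threshold else None


if __name__ == "__main__":
    # A "clean" sample labeled with class 1 out of 4 hypothetical types.
    print(smooth_labels([0.0, 1.0, 0.0, 0.0]))
    # A "noisy" sample: its label is discarded and the prediction is used.
    print(pseudo_label([0.05, 0.92, 0.03]))
    # A low-confidence prediction yields no pseudo label.
    print(pseudo_label([0.40, 0.35, 0.25]))
```

In this sketch, smoothing keeps the target a valid probability distribution while reducing confidence in the given label, and the threshold keeps low-confidence predictions out of training, which is one common way to limit confirmation bias.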

Acknowledgement

This work is supported by the National Natural Science Foundation of China (No. 61906035), the Natural Science Foundation of Shanghai (No. 22ZR1402000) and the Science and Technology Commission of Shanghai Municipality Grant (No. 22511105902).

Corresponding author

Correspondence to Ming Du.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Xu, B., Zhang, Z., Du, M., Wang, H., Song, H., Xiao, Y. (2023). Semi-supervised Learning for Fine-Grained Entity Typing with Mixed Label Smoothing and Pseudo Labeling. In: Wang, X., et al. Database Systems for Advanced Applications. DASFAA 2023. Lecture Notes in Computer Science, vol 13945. Springer, Cham. https://doi.org/10.1007/978-3-031-30675-4_53

  • DOI: https://doi.org/10.1007/978-3-031-30675-4_53

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-30674-7

  • Online ISBN: 978-3-031-30675-4

  • eBook Packages: Computer Science, Computer Science (R0)
