Unsupervised Ultra-Fine Entity Typing with Distributionally Induced Word Senses

  • Conference paper
Analysis of Images, Social Networks and Texts (AIST 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14486)


Abstract

The lack of annotated data is one of the main challenges in ultra-fine entity typing, the task of assigning free-form semantic types to a given entity mention. Hence, automatic type generation, typically used as distant supervision, is receiving increasing interest. In this study, we investigate an unsupervised approach based on distributionally induced word senses: the types, or labels, are obtained by selecting the appropriate sense cluster for a mention. Experimental results on an ultra-fine entity typing task demonstrate that combining our predictions with those of an existing neural model yields a slight improvement on the ultra-fine types for non-pronoun mentions.
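
The sense-cluster selection idea can be illustrated with a toy sketch. This is not the paper's implementation (which builds on JoBimText sense clusters and sentence embeddings, per the notes below); all data and names here are hypothetical, and cluster choice is reduced to simple token overlap to keep the example dependency-free:

```python
# Toy illustration (hypothetical data, not the paper's pipeline): assign
# types to a mention by choosing the induced sense cluster whose member
# words overlap most with the mention's context, then returning that
# cluster's type labels.

# A tiny hand-made "sense inventory" for the ambiguous word "apple".
SENSE_CLUSTERS = {
    "apple#0": {"members": {"fruit", "pear", "banana", "juice"},
                "labels": ["fruit", "food"]},
    "apple#1": {"members": {"microsoft", "google", "iphone", "software"},
                "labels": ["company", "organization"]},
}

def predict_types(context_tokens):
    """Select the sense cluster with the largest context overlap."""
    context = {t.lower() for t in context_tokens}
    best_sense = max(SENSE_CLUSTERS,
                     key=lambda s: len(SENSE_CLUSTERS[s]["members"] & context))
    return SENSE_CLUSTERS[best_sense]["labels"]

print(predict_types("Apple unveiled new iPhone software today".split()))
# → ['company', 'organization']
```

In the actual system, cluster selection uses embedding similarity between the mention's context and cluster members rather than raw token overlap; the overlap score above is only a stand-in for that similarity function.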


Notes

  1. http://ltmaggie.informatik.uni-hamburg.de/jobimviz/#.
  2. https://commoncrawl.org.
  3. https://www.sbert.net/docs/pretrained_models.html.
  4. https://github.com/stanfordnlp/stanza.
  5. https://pypi.org/project/inflect.
  6. http://ltmaggie.informatik.uni-hamburg.de/jobimtext/documentation/sense-clustering.
  7. https://github.com/uwnlp/open_type/blob/master/scorer.py.
  8. http://nlp.cs.washington.edu/entity_type/model/best_model.tar.gz.
  9. https://github.com/HKUST-KnowComp/MLMET/blob/main/prep.py#L9, https://en.wikipedia.org/wiki/English_pronouns#Full_list.
  10. Our code can be found at: https://github.com/uhh-lt/unsupervised-ultra-fine-entity-typing.


Acknowledgements

The work was partially supported by a Deutscher Akademischer Austauschdienst (DAAD) doctoral stipend and the DFG-funded JOIN-T project BI 1544/4.

Author information

Correspondence to Özge Sevgili.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Sevgili, Ö., Remus, S., Jana, A., Panchenko, A., Biemann, C. (2024). Unsupervised Ultra-Fine Entity Typing with Distributionally Induced Word Senses. In: Ignatov, D.I., et al. Analysis of Images, Social Networks and Texts. AIST 2023. Lecture Notes in Computer Science, vol 14486. Springer, Cham. https://doi.org/10.1007/978-3-031-54534-4_9

  • DOI: https://doi.org/10.1007/978-3-031-54534-4_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-54533-7

  • Online ISBN: 978-3-031-54534-4

  • eBook Packages: Computer Science, Computer Science (R0)
