Finding Word Sense Embeddings of Known Meaning

  • Conference paper
  • In: Computational Linguistics and Intelligent Text Processing (CICLing 2018)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13397)

Abstract

Word sense embeddings are vector representations of polysemous words, i.e. words with multiple meanings. These induced sense embeddings, however, do not necessarily correspond to any dictionary sense of the word. To overcome this, we propose a method to find new sense embeddings with known meaning. We term this method refitting, as the new embedding is fitted to model the meaning of a target word in an example sentence. The new lexically refitted embeddings are learnt using both the probabilities and the vector values of the existing induced sense embeddings. Our contributions are threefold: (1) the refitting method for finding the new sense embeddings; (2) a novel smoothing technique for use with the refitting method; and (3) a new similarity measure for words in context, defined using the refitted sense embeddings. We show how our techniques improve the performance of the Adaptive Skip-Gram sense embeddings on word similarity evaluation, and how they allow the embeddings to be used for lexical word sense disambiguation.
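The refitting procedure itself is detailed in the paper; as background, the way induced sense embeddings and their context-conditional probabilities combine into a word-in-context similarity can be sketched as follows. This is a minimal illustration in the spirit of the avgSimC measure of Reisinger and Mooney [3], not the paper's refitted measure; the function name and the toy vectors are hypothetical.

```python
import numpy as np

def expected_sim(senses_a, probs_a, senses_b, probs_b):
    """Expected cosine similarity over the context-conditional
    sense distributions of two words (avgSimC-style).

    senses_*: (k, d) arrays of sense vectors for each word.
    probs_*:  length-k arrays of sense probabilities in context.
    """
    def cos(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
    # Weight every cross-word sense pair by its joint probability.
    return sum(pa * pb * cos(ua, vb)
               for ua, pa in zip(senses_a, probs_a)
               for vb, pb in zip(senses_b, probs_b))

# Toy example: "bank" with two induced senses, "river" with one.
bank    = np.array([[1.0, 0.0], [0.0, 1.0]])
bank_p  = np.array([0.9, 0.1])   # context favours the first sense
river   = np.array([[1.0, 0.1]])
river_p = np.array([1.0])
print(round(expected_sim(bank, bank_p, river, river_p), 3))  # prints 0.905
```

A refitted embedding with known dictionary meaning can be substituted into the same measure in place of an induced sense, which is what makes a lexical similarity measure of the kind in contribution (3) possible.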

Notes

  1. As this part of our method is used with both the unsupervised senses and the lexical senses, referred to as \(\textbf{u}\) and \(\textbf{l}\) respectively in other parts of the paper, here we use a general sense \(\textbf{s}\) to avoid confusion.

  2. https://github.com/sbos/AdaGram.jl.

  3. https://github.com/tanmaykm/Word2Vec.jl/.

  4. It should be noted, though, that the number of meanings is not normally distributed [23].

References

  1. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv:1301.3781 (2013)

  2. Pennington, J., Socher, R., Manning, C.D.: GloVe: global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), pp. 1532–1543 (2014)

  3. Reisinger, J., Mooney, R.J.: Multi-prototype vector-space models of word meaning. In: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 109–117. Association for Computational Linguistics (2010)

  4. Huang, E.H., Socher, R., Manning, C.D., Ng, A.Y.: Improving word representations via global context and multiple word prototypes. In: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers, vol. 1, pp. 873–882. Association for Computational Linguistics (2012)

  5. Tian, F., et al.: A probabilistic model for learning multi-prototype word embeddings. In: COLING, pp. 151–160 (2014)

  6. Bartunov, S., Kondrashkin, D., Osokin, A., Vetrov, D.P.: Breaking sticks and ambiguities with adaptive skip-gram. CoRR abs/1502.07257 (2015)

  7. Miller, G.A.: WordNet: a lexical database for English. Commun. ACM 38, 39–41 (1995)

  8. Véronis, J.: A study of polysemy judgements and inter-annotator agreement. In: Programme and Advanced Papers of the Senseval Workshop, pp. 2–4 (1998)

  9. Iacobacci, I., Pilehvar, M.T., Navigli, R.: SensEmbed: learning sense embeddings for word and relational similarity. In: Proceedings of ACL, pp. 95–105 (2015)

  10. Moro, A., Raganato, A., Navigli, R.: Entity linking meets word sense disambiguation: a unified approach. Trans. Assoc. Comput. Linguist. 2, 231–244 (2014)

  11. Chen, X., Liu, Z., Sun, M.: A unified model for word sense representation and disambiguation. In: EMNLP, pp. 1025–1035. Citeseer (2014)

  12. Agirre, E., Martínez, D., De Lacalle, O.L., Soroa, A.: Evaluating and optimizing the parameters of an unsupervised graph-based WSD algorithm. In: Proceedings of the First Workshop on Graph Based Methods for Natural Language Processing, pp. 89–96. Association for Computational Linguistics (2006)

  13. Agirre, E., Soroa, A.: SemEval-2007 task 02: evaluating word sense induction and discrimination systems. In: Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval 2007), pp. 7–12. Association for Computational Linguistics, Stroudsburg, PA, USA (2007)

  14. Nocedal, J.: Updating quasi-Newton matrices with limited storage. Math. Comput. 35, 773–782 (1980)

  15. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)

  16. Mikolov, T., Yih, W.t., Zweig, G.: Linguistic regularities in continuous space word representations. In: HLT-NAACL, pp. 746–751 (2013)

  17. Bengio, Y., Ducharme, R., Vincent, P., Janvin, C.: A neural probabilistic language model. J. Mach. Learn. Res. 3, 137–186 (2003)

  18. Rosenfeld, R.: Two decades of statistical language modeling: where do we go from here? Proc. IEEE 88, 1270–1278 (2000)

  19. Zipf, G.K.: Human Behavior and the Principle of Least Effort: An Introduction to Human Ecology. Addison-Wesley Press, Cambridge (1949)

  20. Kilgarriff, A.: How dominant is the commonest sense of a word? In: Sojka, P., Kopeček, I., Pala, K. (eds.) TSD 2004. LNCS (LNAI), vol. 3206, pp. 103–111. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-30120-2_14

  21. Bezanson, J., Edelman, A., Karpinski, S., Shah, V.B.: Julia: a fresh approach to numerical computing. SIAM Rev. 59, 65–98 (2017)

  22. Neelakantan, A., Shankar, J., Passos, A., McCallum, A.: Efficient non-parametric estimation of multiple embeddings per word in vector space. arXiv:1504.06654 (2015)

  23. Zipf, G.K.: The meaning-frequency relationship of words. J. Gen. Psychol. 33, 251–256 (1945)

  24. Tengi, R.I.: Design and implementation of the WordNet lexical database and searching software. In: WordNet: An Electronic Lexical Database, p. 105. The MIT Press, Cambridge (1998)

  25. Navigli, R., Litkowski, K.C., Hargraves, O.: SemEval-2007 task 07: coarse-grained English all-words task. In: Proceedings of the 4th International Workshop on Semantic Evaluations (SemEval 2007), pp. 30–35. Association for Computational Linguistics, Stroudsburg, PA, USA (2007)

Author information

Correspondence to Wei Liu.

Copyright information

© 2023 Springer Nature Switzerland AG

About this paper

Cite this paper

White, L., Togneri, R., Liu, W., Bennamoun, M. (2023). Finding Word Sense Embeddings of Known Meaning. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2018. Lecture Notes in Computer Science, vol 13397. Springer, Cham. https://doi.org/10.1007/978-3-031-23804-8_1

  • DOI: https://doi.org/10.1007/978-3-031-23804-8_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-23803-1

  • Online ISBN: 978-3-031-23804-8

  • eBook Packages: Computer Science, Computer Science (R0)
