A New Approach to the Supervised Word Sense Disambiguation

Agre, Gennady; Petrov, Daniel; Keskinova, Simona

doi:10.1007/978-3-319-99344-7_1

Gennady Agre (ORCID: orcid.org/0000-0003-4610-7973), Daniel Petrov & Simona Keskinova

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11089)

Included in the following conference series:

International Conference on Artificial Intelligence: Methodology, Systems, and Applications

Abstract

The paper presents a new supervised approach to the all-words word sense disambiguation (WSD) task that removes the need to construct a separate specialized classifier for each target word. At the core of the approach lies a new interpretation of the notion of ‘class’, which relates each possible meaning of a word to the frequency with which it occurs in some corpora. In this way, all possible senses of different words can be classified uniformly into a restricted set of classes, ranging from the most frequent class to the least frequent one. To represent target and context words, the approach uses word embeddings together with information about their part-of-speech (POS) categories. Experiments show that classifiers trained on examples created by means of the approach outperform the standard baselines used to evaluate all-words WSD classifiers.
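
The abstract gives no implementation details, so the following is only a minimal sketch of the frequency-rank idea of a ‘class’, not the authors' code. It assumes that per-sense occurrence counts are already available for each lemma; the function names (`build_rank_classes`, `decode`), the `max_classes` cap, the tie-breaking rule, and the toy sense labels and counts are all hypothetical and introduced purely for illustration.

```python
from collections import Counter


def build_rank_classes(sense_counts, max_classes=5):
    """Map each (lemma, sense) pair to a frequency-rank class.

    Class 1 is the lemma's most frequent sense, class 2 the next one, etc.
    Ranks beyond max_classes are folded into the last class; this cap is
    an assumption of the sketch, not something stated in the abstract.
    """
    rank_of = {}
    for lemma, counts in sense_counts.items():
        # Sort by descending count; ties are broken alphabetically so that
        # the assignment is deterministic.
        ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        for rank, (sense, _count) in enumerate(ordered, start=1):
            rank_of[(lemma, sense)] = min(rank, max_classes)
    return rank_of


def decode(lemma, predicted_class, rank_of):
    """Return the senses of `lemma` that carry the predicted class."""
    return sorted(
        sense
        for (lem, sense), cls in rank_of.items()
        if lem == lemma and cls == predicted_class
    )


# Toy, made-up counts with hypothetical sense labels (not from any corpus).
toy_counts = {
    "bank": Counter({"bank%finance": 20, "bank%river": 5, "bank%tilt": 1}),
    "plant": Counter({"plant%factory": 7, "plant%flora": 12}),
}
rank_of = build_rank_classes(toy_counts, max_classes=2)
```

Because the class labels are ranks rather than word-specific senses, one classifier can in principle be trained on examples of all target words; a predicted rank is mapped back to a concrete sense only at the end, for instance with `decode("plant", 1, rank_of)`. How target and context words are turned into feature vectors (embeddings plus POS information) is outside the scope of this sketch.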

Notes

1. http://clu.uni.no/icame/manuals/RROWN/INDEX.HTM.
2. When no training set is available, SemCor [4], for example, may be used as the default training set.
3. Moreover, we assume that all words (in both the training and test sets) that have no embeddings are removed from the corresponding sets.
4. https://www.cs.waikato.ac.nz/ml/weka/.
5. https://orange.biolab.si/.
6. http://ixa2.si.ehu.es/ukb.
7. http://web.eecs.umich.edu/~mihalcea/downloads/semcor/semcor3.0.tar.gz.
8. https://github.com/asoroa/ukb/blob/master/src/README.
9. WN30WNGWN30glConOneGraphRelSCOne-synsetEmbeddings.bin downloaded from http://bultreebank.org/en/DemoSem/Embeddings.
10. https://code.google.com/archive/p/word2vec/.
11. https://www.tensorflow.org/.
12. http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html.
13. https://github.com/kingfengji/gcForestDeep.
14. Clearly, the WNFS baseline for such an extended test set will also change.

References

1. Mallery, J.C.: Thinking about foreign policy: finding an appropriate role for artificial intelligence computers. Ph.D. dissertation. MIT Political Science Department, Cambridge, MA (1988)
2. Navigli, R.: Word sense disambiguation: a survey. ACM Comput. Surv. 41(2) (2009). Article 10
3. Fellbaum, C.: WordNet and wordnets. In: Brown, K., et al. (eds.) Encyclopedia of Language and Linguistics, 2nd edn., pp. 665–670. Elsevier, Oxford (2005)
4. Miller, G.A., Leacock, C., Tengi, R., Bunker, R.T.: A semantic concordance. In: Proceedings of the ARPA Workshop on Human Language Technology, pp. 303–308 (1993)
5. Kuchera, H., Francis, W.N.: Computational Analysis of Present-Day American English. Brown University Press, Providence (1967)
6. Pilehvar, M.T., Navigli, R.: A large-scale pseudoword-based evaluation framework for state-of-the-art Word Sense Disambiguation. Comput. Linguist. 40(4), 837–881 (2014)
7. Raganato, A., Camacho-Collados, J., Navigli, R.: Word sense disambiguation: a unified evaluation framework and empirical comparison. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, pp. 99–110 (2017)
8. Lesk, M.: Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone. In: Proceedings of the 5th SIGDOC, New York, NY, pp. 24–26 (1986)
9. Chen, X., Liu, Z., Sun, M.: A unified model for word sense representation and disambiguation. In: Proceedings of EMNLP, pp. 1025–1035 (2014)
10. Camacho-Collados, J., Pilehvar, M.T., Navigli, R.: Nasari: integrating explicit knowledge and corpus statistics for a multilingual representation of concepts and entities. Artif. Intell. 240, 36–64 (2016)
11. Agirre, E., Soroa, A.: Personalizing PageRank for word sense disambiguation. In: Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics, pp. 33–41. Association for Computational Linguistics (2009)
12. Tripodi, R., Pelillo, M.: A game-theoretic approach to word sense disambiguation. arXiv preprint arXiv:1606.07711 (2016)
13. Zhong, Z., Ng, H.T.: It Makes Sense: a wide-coverage Word Sense Disambiguation system for free text. In: Proceedings of the ACL System Demonstrations, pp. 78–83 (2010)
14. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)
15. Taghipour, K., Ng, H.T.: Semi-supervised word sense disambiguation using word embeddings in general and specific domains. In: Proceedings of NAACL HLT, pp. 314–323 (2015)
16. Rothe, S., Schütze, H.: AutoExtend: extending word embeddings to embeddings for synsets and lexemes. In: Proceedings of ACL, Beijing, China, pp. 1793–1803 (2015)
17. Iacobacci, I., Pilehvar, M.T., Navigli, R.: Embeddings for word sense disambiguation: an evaluation study. In: Proceedings of ACL, Berlin, Germany, pp. 897–907 (2016)
18. Melamud, O., Goldberger, J., Dagan, I.: Learning generic context embedding with bidirectional LSTM. In: Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning (CoNLL), pp. 51–61 (2016)
19. Kågebäck, M., Salomonsson, H.: Word sense disambiguation using a bidirectional LSTM. arXiv preprint arXiv:1606.03568 (2016)
20. Yuan, D., Richardson, J., Doherty, R., Evans, C., Altendorf, E.: Semi-supervised word sense disambiguation with neural models. In: Proceedings of COLING, pp. 1374–1385 (2016)
21. Simov, K., Osenova, P., Popov, A.: Using context information for knowledge-based word sense disambiguation. In: Dichev, C., Agre, G. (eds.) AIMSA 2016. LNCS (LNAI), vol. 9883, pp. 130–139. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-44748-3_13
22. Zhou, Z.-H., Feng, J.: Deep Forest: towards an alternative to deep neural networks. In: Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-2017), pp. 3553–3559 (2017)

Author information

Authors and Affiliations

Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, Sofia, Bulgaria
Gennady Agre
Laboratory of Computer Graphics and Geographical Information Systems, Technical University of Sofia, Sofia, Bulgaria
Daniel Petrov & Simona Keskinova

Corresponding author

Correspondence to Gennady Agre.

Editor information

Editors and Affiliations

Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, Sofia, Bulgaria
Gennady Agre
Universität des Saarlandes, Saarbrücken, Germany
Josef van Genabith
DFKI GmbH, Saarbrücken, Germany
Thierry Declerck

About this paper

Cite this paper

Agre, G., Petrov, D., Keskinova, S. (2018). A New Approach to the Supervised Word Sense Disambiguation. In: Agre, G., van Genabith, J., Declerck, T. (eds) Artificial Intelligence: Methodology, Systems, and Applications. AIMSA 2018. Lecture Notes in Computer Science (LNAI), vol 11089. Springer, Cham. https://doi.org/10.1007/978-3-319-99344-7_1

DOI: https://doi.org/10.1007/978-3-319-99344-7_1
Published: 29 August 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99343-0
Online ISBN: 978-3-319-99344-7
eBook Packages: Computer Science, Computer Science (R0)