A New Approach to the Supervised Word Sense Disambiguation

  • Conference paper
  • Artificial Intelligence: Methodology, Systems, and Applications (AIMSA 2018)

Abstract

This paper presents a new supervised approach to the all-words word sense disambiguation (WSD) task that avoids the need to construct a separate specialized classifier for each target word. At the core of the approach lies a new interpretation of the notion of ‘class’, which relates each possible meaning of a word to the frequency with which it occurs in some corpus. In this way, the senses of different words can all be classified uniformly into a restricted set of classes, ranging from the most frequent class to the least frequent one. To represent target and context words, the approach uses word embeddings together with information about their part-of-speech (POS) categories. Experiments show that classifiers trained on examples created with this approach outperform the standard baselines used to evaluate all-words WSD classifiers.
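The frequency-based notion of class can be illustrated with a minimal sketch. It is not the authors' implementation: the toy sense counts, the sense identifiers, the function name `frequency_classes`, and the choice to cap the number of classes by merging all rarer senses into the last class are assumptions made only for illustration.

```python
def frequency_classes(sense_counts, max_classes=3):
    """Map every (lemma, sense) pair to a frequency-rank class.

    Class 0 is the most frequent sense of a lemma, class 1 the second most
    frequent, and so on. Ranks beyond max_classes - 1 are merged into the
    last class (an assumption; the source only says the set is restricted).
    """
    classes = {}
    for lemma, counts in sense_counts.items():
        # Sort by descending count; break ties by sense id so the
        # ranking is deterministic.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        for rank, (sense, _count) in enumerate(ranked):
            classes[(lemma, sense)] = min(rank, max_classes - 1)
    return classes


# Hypothetical sense counts, as they might be collected from a
# sense-annotated corpus.
toy_counts = {
    "bank": {"bank%financial": 20, "bank%river": 5, "bank%tilt": 1, "bank%row": 1},
    "plant": {"plant%factory": 7, "plant%flora": 12},
}

classes = frequency_classes(toy_counts)
for (lemma, sense), cls in sorted(classes.items()):
    print(lemma, sense, cls)
```

Because the labels are ranks rather than word-specific senses, training examples for "bank" and "plant" share one label set, which is what lets a single classifier cover all target words.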

Notes

  1. http://clu.uni.no/icame/manuals/RROWN/INDEX.HTM.

  2. When no training set is available, SemCor [4], for example, may be used as the default training set.

  3. Moreover, we assume that all words (in both the training and test sets) that have no embeddings are removed from the corresponding sets.

  4. https://www.cs.waikato.ac.nz/ml/weka/.

  5. https://orange.biolab.si/.

  6. http://ixa2.si.ehu.es/ukb.

  7. http://web.eecs.umich.edu/~mihalcea/downloads/semcor/semcor3.0.tar.gz.

  8. https://github.com/asoroa/ukb/blob/master/src/README.

  9. WN30WNGWN30glConOneGraphRelSCOne-synsetEmbeddings.bin downloaded from http://bultreebank.org/en/DemoSem/Embeddings.

  10. https://code.google.com/archive/p/word2vec/.

  11. https://www.tensorflow.org/.

  12. http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html.

  13. https://github.com/kingfengji/gcForestDeep.

  14. Clearly, the WNFS baseline for such an extended test set will also change.

References

  1. Mallery, J.C.: Thinking about foreign policy: finding an appropriate role for artificial intelligence computers. Ph.D. dissertation. MIT Political Science Department, Cambridge, MA (1988)

  2. Navigli, R.: Word sense disambiguation: a survey. ACM Comput. Surv. 41(2) (2009). Article 10

  3. Fellbaum, C.: WordNet and wordnets. In: Brown, K., et al. (eds.) Encyclopedia of Language and Linguistics, 2nd edn., pp. 665–670. Elsevier, Oxford (2005)

  4. Miller, G.A., Leacock, C., Tengi, R., Bunker, R.T.: A semantic concordance. In: Proceedings of the ARPA Workshop on Human Language Technology, pp. 303–308 (1993)

  5. Kuchera, H., Francis, W.N.: Computational Analysis of Present-Day American English. Brown University Press, Providence (1967)

  6. Pilehvar, M.T., Navigli, R.: A large-scale pseudoword-based evaluation framework for state-of-the-art Word Sense Disambiguation. Comput. Linguist. 40(4), 837–881 (2014)

  7. Raganato, A., Camacho-Collados, J., Navigli, R.: Word sense disambiguation: a unified evaluation framework and empirical comparison. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, pp. 99–110 (2017)

  8. Lesk, M.: Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone. In: Proceedings of the 5th SIGDOC, New York, NY, pp. 24–26 (1986)

  9. Chen, X., Liu, Z., Sun, M.: A unified model for word sense representation and disambiguation. In: Proceedings of EMNLP, pp. 1025–1035 (2014)

  10. Camacho-Collados, J., Pilehvar, M.H., Navigli, R.: Nasari: integrating explicit knowledge and corpus statistics for a multilingual representation of concepts and entities. Artif. Intell. 240, 36–64 (2016)

  11. Agirre, E., Soroa, A.: Personalizing Pagerank for word sense disambiguation. In: Proceedings of 12th Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, pp. 33–41 (2009)

  12. Tripodi, R., Pelillo, M.: A game-theoretic approach to word sense disambiguation. arXiv preprint arXiv:1606.07711 (2016)

  13. Zhong, Z., Ng, H.T.: It Makes Sense: a wide-coverage Word Sense Disambiguation system for free text. In: Proceedings of the ACL System Demonstrations, pp. 78–83 (2010)

  14. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient Estimation of Word Representations in Vector Space. arXiv preprint arXiv:1301.3781 (2013)

  15. Taghipour, K., Ng, H.T.: Semisupervised word sense disambiguation using word embeddings in general and specific domains. In: Proceedings of NAACL HLT, pp. 314–323 (2015)

  16. Rothe, S., Schutze, H.: AutoExtend: extending word embeddings to embeddings for synsets and lexemes. In: Proceedings of ACL, Beijing, China, pp. 1793–1803 (2015)

  17. Iacobacci, I., Pilehvar, M.H., Navigli, R.: Embeddings for word sense disambiguation: An evaluation study. In: Proceedings of ACL, Berlin, Germany, pp. 897–907 (2016)

  18. Melamud, O., Goldberger, J., Dagan, I.: Learning generic context embedding with bidirectional LSTM. In: Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning (CoNLL), pp. 51–61 (2016)

  19. Kageback, M., Salomonsson, H.: Word sense disambiguation using a bidirectional LSTM. arXiv preprint arXiv:1606.03568 (2016)

  20. Yuan, D., Richardson, J., Doherty, R., Evans, C., Altendorf, E.: Semi-supervised word sense disambiguation with neural models. In: Proceedings of COLING, pp. 1374–1385 (2016)

  21. Simov, K., Osenova, P., Popov, A.: Using context information for knowledge-based word sense disambiguation. In: Dichev, C., Agre, G. (eds.) AIMSA 2016. LNCS (LNAI), vol. 9883, pp. 130–139. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-44748-3_13

  22. Zhou, Z.-H., Feng, J.: Deep Forest: towards an alternative to deep neural networks. In: Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-2017), pp. 3553–3559 (2017)

Author information

Corresponding author

Correspondence to Gennady Agre.

Copyright information

© 2018 Springer Nature Switzerland AG

Cite this paper

Agre, G., Petrov, D., Keskinova, S. (2018). A New Approach to the Supervised Word Sense Disambiguation. In: Agre, G., van Genabith, J., Declerck, T. (eds) Artificial Intelligence: Methodology, Systems, and Applications. AIMSA 2018. Lecture Notes in Computer Science, vol 11089. Springer, Cham. https://doi.org/10.1007/978-3-319-99344-7_1

  • DOI: https://doi.org/10.1007/978-3-319-99344-7_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-99343-0

  • Online ISBN: 978-3-319-99344-7

  • eBook Packages: Computer Science (R0)
