Neural Local and Global Contexts Learning for Word Sense Disambiguation

Fukumoto, Fumiyo; Mishima, Taishin; Li, Jiyi; Suzuki, Yoshimi

doi:10.1007/978-3-030-92273-3_44

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 13111)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

Supervised Word Sense Disambiguation (WSD) has long been a popular NLP topic, yet how to make use of the limited volume of sense-tagged data and interpret diverse contexts as relevant features remains a challenging research question. This paper focuses on this problem and proposes a method for effectively incorporating a variety of contexts into a neural WSD model. Our model is a Transformer-XL framework coupled with Graph Convolutional Networks (GCNs). The GCNs integrate different features from local contexts, i.e., full dependency structures, words with part-of-speech (POS) tags, and word order information, into the model. Using the hidden states obtained by the GCNs, Transformer-XL learns local and global contexts simultaneously, where the global context is obtained from the document in which the target words appear. Experimental results on a series of benchmark WSD datasets show that our method is comparable to state-of-the-art WSD methods that use only a limited amount of sense-tagged data. In particular, an ablation test verified that the dependency structure and POS features contribute to the performance improvement of our model.
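
As a rough illustration of how a graph convolution step can combine dependency edges, word-order edges, and self-loops into per-word hidden states, the following is a minimal sketch and not the authors' implementation. The function name `gcn_layer`, the toy sentence, the edge-type names, the dimensions, and the ReLU activation are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def gcn_layer(h, edges, weights):
    """One edge-typed graph convolution step (illustrative assumption).

    h:       (n_words, d_in) array of input word representations
    edges:   list of (source, target, edge_type) triples
    weights: dict mapping edge_type -> (d_in, d_out) matrix
    """
    n_words = h.shape[0]
    d_out = next(iter(weights.values())).shape[1]
    out = np.zeros((n_words, d_out))
    for src, tgt, etype in edges:
        # Each edge sends a message transformed by its type-specific matrix.
        out[tgt] += h[src] @ weights[etype]
    for i in range(n_words):
        # Self-loop: every word also keeps a transformed copy of itself.
        out[i] += h[i] @ weights["self"]
    return np.maximum(out, 0.0)  # ReLU, chosen here only for illustration

rng = np.random.default_rng(0)
words = ["the", "bank", "closed"]          # hypothetical toy sentence
h0 = rng.normal(size=(len(words), 4))      # stand-in for word/POS embeddings

edge_types = ["det", "det_rev", "nsubj", "nsubj_rev", "next", "prev", "self"]
weights = {t: rng.normal(size=(4, 4)) for t in edge_types}

edges = [
    (1, 0, "det"), (0, 1, "det_rev"),      # bank -> the, and the reverse flow
    (2, 1, "nsubj"), (1, 2, "nsubj_rev"),  # closed -> bank, and the reverse flow
    (0, 1, "next"), (1, 2, "next"),        # left-to-right word order
    (1, 0, "prev"), (2, 1, "prev"),        # right-to-left word order
]

h1 = gcn_layer(h0, edges, weights)
```

In a pipeline of the kind the abstract describes, hidden states such as `h1` would then be passed to a sequence model like Transformer-XL, which is not shown here.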

Notes

1. We used 39 dependency labels provided by the Stanford CoreNLP syntactic parser for the first two types of flows, plus two types of word order and self-loops, which results in 81 ($39 \times 2 + 2 + 1$) different matrices in every layer.
2. https://github.com/pfnet/optuna.
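
The arithmetic in note 1 can be checked with a few lines of code. This is only a sketch: the function name `count_edge_type_matrices` and its parameter names are hypothetical, and the reading that each dependency label is counted once per flow direction is an assumption based on the "two types of flows" wording.

```python
def count_edge_type_matrices(num_dep_labels, num_word_order_types=2, num_self_loops=1):
    """Count the per-layer weight matrices described in note 1."""
    # Each dependency label is counted twice, once for each of the two flows.
    return num_dep_labels * 2 + num_word_order_types + num_self_loops

total = count_edge_type_matrices(39)
print(total)  # 81
```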

References

Al-Rfou, R., Choe, D., Constant, N., Guo, M., Jones, L.: Character-level language modeling with deeper self-attention. In: Proceedings of the 33rd AAAI Conference on Artificial Intelligence, pp. 3159–3166 (2019)
Baevski, A., Auli, M.: Adaptive input representations for neural language modeling. In: Proceedings of 7th International Conference on Learning Representations (2019)
Bastings, J., Titov, I., Aziz, W., Marcheggiani, D., Sima’an, K.: Graph convolutional encoders for syntax-aware neural machine translation. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 1957–1967 (2017)
Bevilacqua, M., Navigli, R.: Breaking through the 80% glass ceiling: raising the state of the art in word sense disambiguation by incorporating knowledge graph information. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 2854–2864 (2020)
Blevins, T., Zettlemoyer, L.: Moving down the long tail of word sense disambiguation with gloss informed bi-encoders. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 1006–1017 (2020)
Dai, Z., Yang, Z., Yang, Y., Carbonell, J., Le, Q.V., Salakhutdinov, R.: Transformer-XL: attentive language models beyond a fixed-length context. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 2978–2988 (2019)
Hadiwinoto, C., Ng, H.T., Gan, W.C.: Improved word sense disambiguation using pre-trained contextualized word representations. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, pp. 5300–5309 (2019)
Iacobacci, I., Pilehvar, M.T., Navigli, R.: Embeddings for word sense disambiguation: an evaluation study. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pp. 897–907 (2016)
Ide, N., Véronis, J.: Introduction to the special issue on word sense disambiguation: the state of the art. J. Assoc. Comput. Linguist. 24(1), 1–40 (1998)
Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. In: Proceedings of the 5th International Conference on Learning Representations (2017)
Levine, Y., Lenz, B., Dagan, O., Ram, O., et al.: SenseBERT: driving some sense into BERT. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 4656–4667 (2020)
Li, Q., Han, Z., Wu, X.M.: Deeper insights into graph convolutional networks for semi-supervised learning. In: Proceedings of 32nd AAAI Conference on Artificial Intelligence, pp. 3538–3545 (2018)
Luo, F., Liu, T., Xia, Q., Chang, B., Sui, Z.: Incorporating glosses into neural word sense disambiguation. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, pp. 2473–2482 (2018)
Manning, C.D., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S.J., McClosky, D.: The Stanford CoreNLP natural language processing toolkit. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 55–60 (2014)
Marcheggiani, D., Titov, I.: Encoding sentences with graph convolutional networks for semantic role labeling. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 1506–1515 (2017)
Melamud, O., Goldberger, J., Dagan, I.: Context2vec: learning generic context embedding with bidirectional LSTM. In: Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning, pp. 51–61 (2016)
Merity, S., Xiong, C., Bradbury, J., Socher, R.: Pointer sentinel mixture models. In: arXiv preprint arXiv:1609.07843 (2016)
Pennington, J., Socher, R., Manning, C.D.: GloVe: global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, pp. 1532–1543 (2014)
Peters, M.E., et al.: Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 2227–2237 (2018)
Raganato, A., Bovi, C.D., Navigli, R.: Neural sequence learning models for word sense disambiguation. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 1156–1167 (2017)
Raganato, A., Camacho-Collados, J., Navigli, R.: Word sense disambiguation: a unified evaluation framework and empirical comparison. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, pp. 99–110 (2017)
Schlichtkrull, M., Kipf, T.N., Bloem, P., Berg, R.V.D., Titov, I., Welling, M.: Modeling relational data with graph convolutional networks. In: Proceedings of European Semantic Web Conference, pp. 593–607 (2018)
Vashishth, S., Bhandari, M., Yadav, P., Rai, P., Bhattacharyya, C., Talukdar, P.: Incorporating syntactic and semantic information in word embeddings using graph convolutional networks. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 3308–3318 (2019)
Vaswani, A., et al.: Attention is all you need. In: Proceedings of the NIPS, pp. 6000–6010 (2017)
Xu, Y., Yang, J.: Look again at the syntax: Relational graph convolutional network for gendered ambiguous pronoun resolution. In: Proceedings of the 1st Workshop on Gender Bias in Natural Language Processing, pp. 99–104 (2019)
Yarowsky, D.: One sense per collocation. In: Proceedings of ARPA Human Language Processing Technology Workshop, pp. 266–271 (1993)
Zhong, Z., Ng, H.T.: It makes sense: a wide-coverage word sense disambiguation system for free text. In: Proceedings of the ACL 2010 System Demonstrations, pp. 78–83 (2010)

Acknowledgements

We are grateful to the anonymous reviewers for their comments and suggestions. This work was supported by the Grant-in-aid for JSPS, Grant Number 21K12026, and JKA through its promotion funds from KEIRIN RACE.

Author information

Authors and Affiliations

Interdisciplinary Graduate School, University of Yamanashi, Kofu, Japan
Fumiyo Fukumoto, Jiyi Li & Yoshimi Suzuki
Graduate School of Engineering, University of Yamanashi, Kofu, Japan
Taishin Mishima

Corresponding author

Correspondence to Fumiyo Fukumoto.

Editor information

Editors and Affiliations

Sampoerna University, Jakarta, Indonesia
Teddy Mantoro
Kyungpook National University, Daegu, Korea (Republic of)
Minho Lee
Sampoerna University, Jakarta, Indonesia
Media Anugerah Ayu
Murdoch University, Murdoch, WA, Australia
Kok Wai Wong
Universitas Indonesia, Depok, Indonesia
Achmad Nizar Hidayanto

About this paper

Cite this paper

Fukumoto, F., Mishima, T., Li, J., Suzuki, Y. (2021). Neural Local and Global Contexts Learning for Word Sense Disambiguation. In: Mantoro, T., Lee, M., Ayu, M.A., Wong, K.W., Hidayanto, A.N. (eds) Neural Information Processing. ICONIP 2021. Lecture Notes in Computer Science, vol 13111. Springer, Cham. https://doi.org/10.1007/978-3-030-92273-3_44

DOI: https://doi.org/10.1007/978-3-030-92273-3_44
Published: 05 December 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92272-6
Online ISBN: 978-3-030-92273-3
eBook Packages: Computer Science, Computer Science (R0)
