Word Sense Disambiguation Combining Knowledge Graph and Text Hierarchical Structure

Published: 23 November 2024

Abstract

Supervised word sense disambiguation models have achieved high accuracy by combining sense-annotated corpora with pre-trained language models. However, the sense information these models consume takes the form of short texts, and such limited corpus information is often not rich enough to distinguish a word's meanings across different scenarios. This article proposes a bi-encoder word sense disambiguation method that combines a knowledge graph with the hierarchical structure of the input text: structured knowledge from the knowledge graph supplements the short sense descriptions with extended semantic information, the hierarchy of the contextual input text is used to describe the meaning of words and phrases, and a BERT-based bi-encoder is constructed with a graph attention network that reduces noise in the contextual input, improving the disambiguation accuracy of target words in phrase form and, ultimately, the overall effectiveness of the method. Compared with nine recent algorithms on five test datasets, the method achieves higher disambiguation accuracy in most settings.
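
The abstract describes the architecture only at a high level, so the following is a minimal sketch of what a BERT-based bi-encoder with a graph attention layer over knowledge-graph structure could look like. Everything here is an assumption for illustration, not the paper's implementation: the class and argument names, the use of PyTorch Geometric's GATConv, the span-mean target representation, and the dot-product sense scoring are all illustrative choices, and the construction of the knowledge-graph edges and the text hierarchy is left to the caller because the abstract does not specify it.

```python
# Illustrative sketch only -- not the authors' implementation.
# Assumes PyTorch, Hugging Face transformers, and PyTorch Geometric.
import torch
import torch.nn as nn
from transformers import BertModel
from torch_geometric.nn import GATConv


class BiEncoderWSD(nn.Module):
    """Scores candidate senses of a target word against its context."""

    def __init__(self, model_name: str = "bert-base-uncased", hidden: int = 768):
        super().__init__()
        self.context_encoder = BertModel.from_pretrained(model_name)  # encodes the sentence
        self.gloss_encoder = BertModel.from_pretrained(model_name)    # encodes sense descriptions
        # Graph attention over sense embeddings connected by knowledge-graph
        # edges; here it stands in for the paper's noise-reduction step.
        self.gat = GATConv(hidden, hidden, heads=1)

    def forward(self, ctx_inputs, target_span, gloss_inputs, edge_index):
        # Represent the target word as the mean of its WordPiece vectors.
        ctx_hidden = self.context_encoder(**ctx_inputs).last_hidden_state  # (1, L, H)
        start, end = target_span
        target_vec = ctx_hidden[0, start:end].mean(dim=0)                  # (H,)

        # One [CLS] vector per candidate sense (gloss plus any graph-derived text).
        sense_vecs = self.gloss_encoder(**gloss_inputs).last_hidden_state[:, 0]  # (S, H)

        # Refine sense vectors by attending over knowledge-graph neighbours.
        sense_vecs = self.gat(sense_vecs, edge_index)  # edge_index: (2, E) long tensor

        # Dot-product score for each candidate sense; predict via argmax.
        return sense_vecs @ target_vec  # (S,)
```

At inference time, the predicted sense would simply be the argmax of the returned scores; training would typically apply a cross-entropy loss over them against the annotated sense.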

    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 23, Issue 12
    December 2024
    237 pages
    EISSN: 2375-4702
    DOI: 10.1145/3613720

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 23 November 2024
    Online AM: 25 July 2024
    Accepted: 25 May 2024
    Revised: 13 April 2023
    Received: 17 January 2022

    Author Tags

    1. Word sense disambiguation
    2. knowledge graph
    3. BERT
    4. graph attention network

    Qualifiers

    • Research-article
