Word Sense Disambiguation Combining Knowledge Graph and Text Hierarchical Structure

Published: 23 November 2024

Abstract

Supervised word sense disambiguation models have achieved high accuracy by combining sense-annotated corpora with pre-trained language models. However, the sense information these models consume takes the form of short texts, and such limited corpus information is often not rich enough to distinguish a word's meanings across different scenarios. This article proposes a bi-encoder word sense disambiguation method that combines a knowledge graph with the hierarchical structure of the input text: structured knowledge from the knowledge graph supplements the short sense descriptions with extended semantic information, the hierarchy of the contextual input text is used to describe the meaning of words and phrases, and a BERT-based bi-encoder is constructed with a graph attention network that reduces noise in the contextual input, improving the disambiguation accuracy of target words in phrase form and, ultimately, the overall effectiveness of the method. Compared with nine recent algorithms on five test datasets, the method achieves higher disambiguation accuracy in most settings.
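
The abstract describes the architecture only at a high level, so the following is a minimal sketch of what a BERT-based bi-encoder with a graph attention layer over knowledge-graph structure could look like. Everything here is an assumption for illustration, not the paper's implementation: the class and argument names, the use of PyTorch Geometric's GATConv, the span-mean target representation, and the dot-product sense scoring are all illustrative choices, and the construction of the knowledge-graph edges and the text hierarchy is left to the caller because the abstract does not specify it.

```python
# Illustrative sketch only -- not the authors' implementation.
# Assumes PyTorch, Hugging Face transformers, and PyTorch Geometric.
import torch
import torch.nn as nn
from transformers import BertModel
from torch_geometric.nn import GATConv


class BiEncoderWSD(nn.Module):
    """Scores candidate senses of a target word against its context."""

    def __init__(self, model_name: str = "bert-base-uncased", hidden: int = 768):
        super().__init__()
        self.context_encoder = BertModel.from_pretrained(model_name)  # encodes the sentence
        self.gloss_encoder = BertModel.from_pretrained(model_name)    # encodes sense descriptions
        # Graph attention over sense embeddings connected by knowledge-graph
        # edges; here it stands in for the paper's noise-reduction step.
        self.gat = GATConv(hidden, hidden, heads=1)

    def forward(self, ctx_inputs, target_span, gloss_inputs, edge_index):
        # Represent the target word as the mean of its WordPiece vectors.
        ctx_hidden = self.context_encoder(**ctx_inputs).last_hidden_state  # (1, L, H)
        start, end = target_span
        target_vec = ctx_hidden[0, start:end].mean(dim=0)                  # (H,)

        # One [CLS] vector per candidate sense (gloss plus any graph-derived text).
        sense_vecs = self.gloss_encoder(**gloss_inputs).last_hidden_state[:, 0]  # (S, H)

        # Refine sense vectors by attending over knowledge-graph neighbours.
        sense_vecs = self.gat(sense_vecs, edge_index)  # edge_index: (2, E) long tensor

        # Dot-product score for each candidate sense; predict via argmax.
        return sense_vecs @ target_vec  # (S,)
```

At inference time, the predicted sense would simply be the argmax of the returned scores; training would typically apply a cross-entropy loss over them against the annotated sense.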

    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 23, Issue 12
    December 2024
    237 pages
    EISSN: 2375-4702
    DOI: 10.1145/3613720

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 23 November 2024
    Online AM: 25 July 2024
    Accepted: 25 May 2024
    Revised: 13 April 2023
    Received: 17 January 2022

    Author Tags

    1. Word sense disambiguation
    2. knowledge graph
    3. BERT
    4. graph attention network

    Qualifiers

    • Research-article
