research-article

Word Sense Disambiguation by Refining Target Word Embedding

Authors:
Xuefeng Zhang

SKLSDE, School of Computer Science and Engineering, Beihang University, China

SKLSDE, School of Computer Science and Engineering, Beihang University, China

0000-0002-2604-1885
View Profile

,
Richong Zhang

SKLSDE, School of Computer Science and Engineering, Beihang University, China

SKLSDE, School of Computer Science and Engineering, Beihang University, China

0000-0002-1207-0300
View Profile

,
Xiaoyang Li

SKLSDE, School of Computer Science and Engineering, Beihang University, China

SKLSDE, School of Computer Science and Engineering, Beihang University, China

0000-0002-7418-4506
View Profile

,
Fanshuang Kong

SKLSDE, School of Computer Science and Engineering, Beihang University, China

SKLSDE, School of Computer Science and Engineering, Beihang University, China

0000-0002-9046-740X
View Profile

,
Junfan Chen

SKLSDE, School of Computer Science and Engineering, Beihang University, China

SKLSDE, School of Computer Science and Engineering, Beihang University, China

0000-0001-6807-0089
View Profile

,
Samuel Mensah

The University of Sheffield, United Kingdom

The University of Sheffield, United Kingdom

0000-0003-0779-5574
View Profile

,
Yongyi Mao

University of Ottawa, Canada

University of Ottawa, Canada

0000-0001-5298-5778
View Profile

Authors Info & Claims

WWW '23: Proceedings of the ACM Web Conference 2023April 2023Pages 1405–1414https://doi.org/10.1145/3543507.3583191

Published:30 April 2023Publication History

WWW '23: Proceedings of the ACM Web Conference 2023

Pages 1405–1414

ABSTRACT

Word Sense Disambiguation (WSD) which aims to identify the correct sense of a target word appearing in a specific context is essential for web text analysis. The use of glosses has been explored as a means for WSD. However, only a few works model the correlation between the target context and gloss. We add to the body of literature by presenting a model that employs a multi-head attention mechanism on deep contextual features of the target word and candidate glosses to refine the target word embedding. Furthermore, to encourage the model to learn the relevant part of target features that align with the correct gloss, we recursively alternate attention on target word features and that of candidate glosses to gradually extract the relevant contextual features of the target word, refining its representation and strengthening the final disambiguation results. Empirical studies on the five most commonly used benchmark datasets show that our proposed model is effective and achieves state-of-the-art results.

References

Satanjeev Banerjee and Ted Pedersen. 2002. An Adapted Lesk Algorithm for Word Sense Disambiguation Using WordNet. In Computational Linguistics and Intelligent Text Processing, Third International Conference, CICLing 2002, Mexico City, Mexico, February 17-23, 2002, Proceedings(Lecture Notes in Computer Science, Vol. 2276). Springer, 136–145.Google Scholar
Edoardo Barba, Tommaso Pasini, and Roberto Navigli. 2021. ESC: Redesigning WSD with Extractive Sense Comprehension. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2021, Online, June 6-11, 2021. Association for Computational Linguistics, 4661–4672.Google ScholarCross Ref
Edoardo Barba, Luigi Procopio, and Roberto Navigli. 2021. ConSeC: Word Sense Disambiguation as Continuous Sense Comprehension. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021, Virtual Event / Punta Cana, Dominican Republic, 7-11 November, 2021. Association for Computational Linguistics, 1492–1503.Google ScholarCross Ref
Pierpaolo Basile, Annalina Caputo, and Giovanni Semeraro. 2014. An Enhanced Lesk Word Sense Disambiguation Algorithm through a Distributional Semantic Model. In COLING 2014, 25th International Conference on Computational Linguistics, Proceedings of the Conference: Technical Papers, August 23-29, 2014, Dublin, Ireland, Jan Hajic and Junichi Tsujii (Eds.). ACL, 1591–1600.Google Scholar
Michele Bevilacqua and Roberto Navigli. 2020. Breaking Through the 80% Glass Ceiling: Raising the State of the Art in Word Sense Disambiguation by Incorporating Knowledge Graph Information. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5-10, 2020. Association for Computational Linguistics, 2854–2864. https://doi.org/10.18653/v1/2020.acl-main.255Google ScholarCross Ref
Terra Blevins, Mandar Joshi, and Luke Zettlemoyer. 2021. FEWS: Large-Scale, Low-Shot Word Sense Disambiguation with the Dictionary. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume. 455–465.Google ScholarCross Ref
Terra Blevins and Luke Zettlemoyer. 2020. Moving Down the Long Tail of Word Sense Disambiguation with Gloss Informed Bi-encoders. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5-10, 2020. Association for Computational Linguistics, 1006–1017.Google ScholarCross Ref
Claudio Delli Bovi, Luca Telesca, and Roberto Navigli. 2015. Large-scale information extraction from textual definitions through deep syntactic and semantic analysis. Transactions of the Association for Computational Linguistics 3 (2015), 529–543.Google ScholarCross Ref
Yee Seng Chan, Hwee Tou Ng, and David Chiang. 2007. Word sense disambiguation improves statistical machine translation. In Proceedings of the 45th annual meeting of the association of computational linguistics. 33–40.Google Scholar
Ronan Collobert and Jason Weston. 2008. A unified architecture for natural language processing: Deep neural networks with multitask learning. In Proceedings of the 25th international conference on Machine learning. 160–167.Google ScholarDigital Library
Yann N Dauphin, Angela Fan, Michael Auli, and David Grangier. 2017. Language modeling with gated convolutional networks. In International conference on machine learning. PMLR, 933–941.Google Scholar
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805(2018).Google Scholar
Philip Edmonds and Scott Cotton. 2001. Senseval-2: overview. In Proceedings of SENSEVAL-2 Second International Workshop on Evaluating Word Sense Disambiguation Systems. 1–5.Google Scholar
Christian Hadiwinoto, Hwee Tou Ng, and Wee Chung Gan. 2019. Improved Word Sense Disambiguation Using Pre-Trained Contextualized Word Representations. In EMNLP.Google Scholar
Zijian Hu, Fuli Luo, Yutong Tan, Wenxin Zeng, and Zhifang Sui. 2019. WSD-GAN: Word Sense Disambiguation Using Generative Adversarial Networks. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33. 9943–9944.Google ScholarDigital Library
Luyao Huang, Chi Sun, Xipeng Qiu, and Xuanjing Huang. 2019. GlossBERT: BERT for Word Sense Disambiguation with Gloss Knowledge. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, Hong Kong, China, 3507–3512. https://doi.org/10.18653/v1/D19-1355Google ScholarCross Ref
Ignacio Iacobacci, Mohammad Taher Pilehvar, and Roberto Navigli. 2016. Embeddings for word sense disambiguation: An evaluation study. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 897–907.Google ScholarCross Ref
Ganesh Jawahar, Benoît Sagot, and Djamé Seddah. 2019. What Does BERT Learn about the Structure of Language¿. In Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28- August 2, 2019, Volume 1: Long Papers. Association for Computational Linguistics, 3651–3657.Google ScholarCross Ref
Adam Kilgarriff and Joseph Rosenzweig. 2000. Framework and results for English SENSEVAL. Computers and the Humanities 34 (2000), 15–48.Google ScholarCross Ref
Sawan Kumar, Sharmistha Jat, Karan Saxena, and Partha P. Talukdar. 2019. Zero-shot Word Sense Disambiguation using Sense Definition Embeddings. In Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28- August 2, 2019, Volume 1: Long Papers, Anna Korhonen, David R. Traum, and Lluís Màrquez (Eds.). Association for Computational Linguistics, 5670–5681.Google ScholarCross Ref
Michael Lesk. 1986. Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone. In Proceedings of the 5th annual international conference on Systems documentation. 24–26.Google ScholarDigital Library
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692(2019).Google Scholar
Daniel Loureiro and Alipio Jorge. 2019. Language modelling makes sense: Propagating representations through wordnet for full-coverage word sense disambiguation. arXiv preprint arXiv:1906.10007(2019).Google Scholar
Fuli Luo, Tianyu Liu, Zexue He, Qiaolin Xia, Zhifang Sui, and Baobao Chang. 2018. Leveraging gloss knowledge in neural word sense disambiguation by hierarchical co-attention. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 1402–1411.Google ScholarCross Ref
Fuli Luo, Tianyu Liu, Qiaolin Xia, Baobao Chang, and Zhifang Sui. 2018. Incorporating Glosses into Neural Word Sense Disambiguation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL). Association for Computational Linguistics, 2473–2482. http://aclweb.org/anthology/P18-1230Google ScholarCross Ref
Minh-Thang Luong, Hieu Pham, and Christopher D Manning. 2015. Effective approaches to attention-based neural machine translation. arXiv preprint arXiv:1508.04025(2015).Google Scholar
Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781(2013).Google Scholar
George A Miller. 1995. WordNet: a lexical database for English. Commun. ACM 38, 11 (1995), 39–41.Google ScholarDigital Library
George A Miller, Martin Chodorow, Shari Landes, Claudia Leacock, and Robert G Thomas. 1994. Using a semantic concordance for sense identification. In HUMAN LANGUAGE TECHNOLOGY: Proceedings of a Workshop held at Plainsboro, New Jersey, March 8-11, 1994.Google ScholarDigital Library
Andrea Moro and Roberto Navigli. 2015. Semeval-2015 task 13: Multilingual all-words sense disambiguation and entity linking. In Proceedings of the 9th international workshop on semantic evaluation (SemEval 2015). 288–297.Google ScholarCross Ref
Roberto Navigli, David Jurgens, and Daniele Vannella. 2013. Semeval-2013 task 12: Multilingual word sense disambiguation. In Second Joint Conference on Lexical and Computational Semantics (* SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013). 222–231.Google Scholar
Steven Neale, Luís Gomes, Eneko Agirre, Oier Lopez de Lacalle, and António Branco. 2016. Word sense-aware machine translation: Including senses as contextual features for improved translation models. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16). 2777–2783.Google Scholar
Jeffrey Pennington, Richard Socher, and Christopher D Manning. 2014. Glove: Global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). 1532–1543.Google ScholarCross Ref
Sameer Pradhan, Edward Loper, Dmitriy Dligach, and Martha Palmer. 2007. Semeval-2007 task-17: English lexical sample, srl and all words. In Proceedings of the fourth international workshop on semantic evaluations (SemEval-2007). 87–92.Google ScholarDigital Library
Alessandro Raganato, Claudio Delli Bovi, and Roberto Navigli. 2017. Neural sequence learning models for word sense disambiguation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 1156–1167.Google ScholarCross Ref
Alessandro Raganato, Jose Camacho-Collados, and Roberto Navigli. 2017. Word sense disambiguation: A unified evaluation framework and empirical comparison. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers. 99–110.Google ScholarCross Ref
Bianca Scarlini, Tommaso Pasini, and Roberto Navigli. 2020. With more contexts comes better performance: Contextualized sense embeddings for all-round Word Sense Disambiguation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 3528–3539.Google ScholarCross Ref
Benjamin Snyder and Martha Palmer. 2004. The English all-words task. In Proceedings of SENSEVAL-3, the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text. 41–43.Google Scholar
Ying Su, Hongming Zhang, Yangqiu Song, and Tong Zhang. 2022. Rare and Zero-shot Word Sense Disambiguation using Z-Reweighting. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 4713–4723.Google ScholarCross Ref
Ming Wang and Yinglin Wang. 2020. A synset relation-enhanced framework with a try-again mechanism for Word Sense Disambiguation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 6229–6240.Google ScholarCross Ref
Ming Wang and Yinglin Wang. 2021. Word sense disambiguation: Towards interactive context exploitation from both word and sense perspectives. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 5218–5229.Google ScholarCross Ref
Guobiao Zhang, Wenpeng Lu, Xueping Peng, Shoujin Wang, Baoshuo Kan, and Rui Yu. 2022. Word Sense Disambiguation with Knowledge-Enhanced and Local Self-Attention-based Extractive Sense Comprehension. In Proceedings of the 29th International Conference on Computational Linguistics. 4061–4070.Google Scholar
Junwei Zhang, Ruifang He, Fengyu Guo, Jinsong Ma, and Mengnan Xiao. 2022. Disentangled Representation for Long-tail Senses of Word Sense Disambiguation. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management. 2569–2579.Google ScholarDigital Library
Zhi Zhong and Hwee Tou Ng. 2010. It Makes Sense: A Wide-Coverage Word Sense Disambiguation System for Free Text. In Proceedings of the ACL 2010 System Demonstrations. Association for Computational Linguistics, Uppsala, Sweden, 78–83. https://www.aclweb.org/anthology/P10-4014Google Scholar

Index Terms

Word Sense Disambiguation by Refining Target Word Embedding
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Lexical semantics

Recommendations

A Sense Annotated Corpus for All-Words Urdu Word Sense Disambiguation

Word Sense Disambiguation (WSD) aims to automatically predict the correct sense of a word used in a given context. All human languages exhibit word sense ambiguity, and resolving this ambiguity can be difficult. Standard benchmark resources are required ...
Read More
A word sense disambiguation corpus for Urdu
Abstract
The aim of word sense disambiguation (WSD) is to correctly identify the meaning of a word in context. All natural languages exhibit word sense ambiguities and these are often hard to resolve automatically. Consequently WSD is considered an ...
Read More
Unsupervised translated word sense disambiguation in constructing bilingual lexical database
SAC '18: Proceedings of the 33rd Annual ACM Symposium on Applied Computing

The performance of a machine translation system depends on the availability of bilingual lexical dictionary and completion of its word sense disambiguation performance. Word sense disambiguation plays a vital role in several applications such as machine ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
WWW '23: Proceedings of the ACM Web Conference 2023
April 2023
4293 pages
ISBN:9781450394161
DOI:10.1145/3543507
Editors:
Ying Ding,
Jie Tang,
Juan Sequeda,
Lora Aroyo,
Carlos Castillo,
Geert-Jan Houben
Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 30 April 2023
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Embedding Refinement
Recursive Attention
Word Sense Disambiguation
Qualifiers
- research-article
- Research
- Refereed limited
Conference

Acceptance Rates
Overall Acceptance Rate1,899of8,196submissions,23%
Upcoming Conference
WWW '24

Sponsor:

sigweb

The ACM Web Conference 2024

May 13 - 17, 2024

Singapore , Singapore
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 1
  Total Citations
  View Citations
- 258
  Total Downloads
- Downloads (Last 12 months)258
- Downloads (Last 6 weeks)16
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format .

View HTML Format

Word Sense Disambiguation by Refining Target Word Embedding

WWW '23: Proceedings of the ACM Web Conference 2023

ABSTRACT

References

Cited By

Index Terms

Recommendations

A Sense Annotated Corpus for All-Words Urdu Word Sense Disambiguation

A word sense disambiguation corpus for Urdu

Unsupervised translated word sense disambiguation in constructing bilingual lexical database

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Upcoming Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

HTML Format

Caption

Word Sense Disambiguation by Refining Target Word Embedding

WWW '23: Proceedings of the ACM Web Conference 2023

ABSTRACT

References

Cited By

Index Terms

Recommendations

A Sense Annotated Corpus for All-Words Urdu Word Sense Disambiguation

A word sense disambiguation corpus for Urdu

Unsupervised translated word sense disambiguation in constructing bilingual lexical database

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Upcoming Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

HTML Format

Share this Publication link

Share on Social Media