Unsupervised similarity-based word sense disambiguation using context vectors and sentential word importance

Published: 16 May 2012

Abstract

The process of identifying the actual meanings of words in a given text fragment has a long history in the field of computational linguistics. Due to its importance in understanding the semantics of natural language, it is considered one of the most challenging problems facing this field. In this article we propose a new unsupervised similarity-based word sense disambiguation (WSD) algorithm that operates by computing the semantic similarity between glosses of the target word and a context vector. The sense of the target word is determined as that for which the similarity between gloss and context vector is greatest. Thus, whereas conventional unsupervised WSD methods are based on measuring pairwise similarity between words, our approach is based on measuring semantic similarity between sentences. This enables it to utilize a higher degree of semantic information, and is more consistent with the way that human beings disambiguate; that is, by considering the greater context in which the word appears. We also show how performance can be further improved by incorporating a preliminary step in which the relative importance of words within the original text fragment is estimated, thereby providing an ordering that can be used to determine the sequence in which words should be disambiguated. We provide empirical results that show that our method performs favorably against the state-of-the-art unsupervised word sense disambiguation methods, as evaluated on several benchmark datasets through different models of evaluation.
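The selection rule described in the abstract (pick the sense whose gloss is most similar to the context) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the sense inventory `TOY_GLOSSES`, the stopword list, and the function names are hypothetical, and a plain bag-of-words cosine stands in for the sentence-level semantic similarity measure used in the article.

```python
import math
from collections import Counter

# Hypothetical toy sense inventory; a real system would read glosses from WordNet.
TOY_GLOSSES = {
    "bank": {
        "bank.river": "sloping land beside a river or lake",
        "bank.money": "financial institution that accepts deposits and lends money",
    }
}

# Small illustrative stopword list (an assumption, not taken from the article).
STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "to", "on", "in", "by", "at"}


def bag_of_words(text):
    """Lowercase, split on whitespace, and count the non-stopword tokens."""
    return Counter(t for t in text.lower().split() if t not in STOPWORDS)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def disambiguate(target, sentence, glosses=TOY_GLOSSES):
    """Return the sense of `target` whose gloss best matches the context.

    The context vector is built from every word of the sentence except the
    target itself. Cosine over word counts is a stand-in similarity measure.
    """
    context_words = [t for t in sentence.lower().split() if t != target]
    context = bag_of_words(" ".join(context_words))
    scores = {
        sense: cosine(bag_of_words(gloss), context)
        for sense, gloss in glosses[target].items()
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    print(disambiguate("bank", "she sat on the bank of the river watching the lake"))
    print(disambiguate("bank", "he went to the bank to deposit money"))
```

Because each candidate sense is scored against one context vector rather than against every sense of every neighbouring word, the number of comparisons grows with the number of senses of the target word instead of with the product of sense counts across the whole fragment.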

    Reviews

    Franz J Kurfess

    One of the major challenges for computers remains the ability to deal with text formulated in a natural language. While significant advances in computational linguistics, artificial intelligence, computing power, and other aspects have given us tools like search engines that are practically indispensable for many tasks, we also deal with other issues such as translation and virtual assistants, with performance ranging from amazing to appalling. One of the core underlying issues is the ability of humans to quickly determine the meaning of a natural language statement, usually without much effort. This interpretation of natural language statements relies on a combination of analyzing the structure of sentences and determining the intended meaning of words and phrases. For computers, the second aspect is far more challenging, both conceptually and computationally.

    In this paper, the authors describe an approach that addresses the word sense disambiguation (WSD) problem by measuring the semantic similarity between the different possible interpretations of a particular word and its context, represented by the remaining words within the text fragment under consideration. This approach is also referred to as knowledge-based WSD, in contrast to the corpus-based methods that rely on the availability of a set of training data consisting of words labeled with the correct meaning. Knowledge-based WSD has been pursued by other researchers with various similarity measures and computational refinements, but it is seriously affected by the number of calculations that result from comparing the different interpretations of all the words under consideration. Naive approaches typically lead to combinatorial increases in time or space requirements, whereas refinements often require restrictions such as considering only shorter fragments of text.

    The new method's computational complexity is quadratic in the number of words in the context, whereas other approaches are often exponential in the size of the context window (also a measure of the number of words). Further improvements are achieved by establishing the order in which the words of the text fragment are considered for disambiguation. This relies on a graph-based approach to weigh the "importance" of words within a text fragment, similar to Google's PageRank algorithm for ordering Web pages in search results.

    The paper is well written, nicely structured, and reasonably easy to follow. It offers a good overview of the current state of WSD, with brief descriptions of commonly used methods. The results obtained by the authors are better than those of the methods they compared against, both in standalone experiments based on commonly used datasets and in more challenging tasks involving sentence similarity measurement and sentence clustering. Especially for the sentence similarity task, the authors report significantly better results than previous approaches, also surpassing the mean performance of human participants in the experiment.

    Online Computing Reviews Service
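As a rough illustration of the PageRank-style word importance step mentioned in the review, the sketch below ranks the words of a fragment on a small word graph and returns them in a candidate disambiguation order. Several details are assumptions made for the example rather than facts from the article: edges link words that co-occur within a fixed window (in the spirit of TextRank), all edges are unweighted, the damping factor is 0.85, and the most important word is placed first. The function names are hypothetical.

```python
def word_importance(tokens, window=2, damping=0.85, iterations=50):
    """PageRank-style scores over an undirected word co-occurrence graph.

    Assumption: two distinct words are linked if they appear within
    `window` positions of each other in `tokens`.
    """
    vocab = sorted(set(tokens))
    neighbors = {w: set() for w in vocab}
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            if tokens[j] != w:
                neighbors[w].add(tokens[j])
                neighbors[tokens[j]].add(w)

    n = len(vocab)
    rank = {w: 1.0 / n for w in vocab}
    for _ in range(iterations):
        new_rank = {}
        for w in vocab:
            # Each neighbour v shares its current score equally among its links.
            incoming = sum(rank[v] / len(neighbors[v]) for v in neighbors[w])
            new_rank[w] = (1 - damping) / n + damping * incoming
        rank = new_rank
    return rank


def disambiguation_order(tokens, **kwargs):
    """Words sorted by descending importance (ties broken alphabetically)."""
    rank = word_importance(tokens, **kwargs)
    return sorted(rank, key=lambda w: (-rank[w], w))


if __name__ == "__main__":
    fragment = ["river", "bank", "water", "bank", "fish", "bank", "boat"]
    print(disambiguation_order(fragment, window=1))
```

In this toy fragment the word "bank" is adjacent to every other word, so it receives the highest score and would be handled first under the ordering assumed here.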

    Published In

    ACM Transactions on Speech and Language Processing, Volume 9, Issue 1
    May 2012
    44 pages
    ISSN:1550-4875
    EISSN:1550-4883
    DOI:10.1145/2168748
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 16 May 2012
    Accepted: 01 October 2011
    Revised: 01 October 2011
    Received: 01 May 2011
    Published in TSLP Volume 9, Issue 1

    Author Tags

    1. Word sense disambiguation
    2. WordNet
    3. semantic similarity
    4. unsupervised similarity-based
    5. word importance

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Bibliometrics & Citations

    Article Metrics

    • Downloads (last 12 months): 2
    • Downloads (last 6 weeks): 0
    Reflects downloads up to 15 Feb 2025

    Cited By

    • (2022) AcX. Proceedings of the VLDB Endowment 15:11 (2530-2544). DOI: 10.14778/3551793.3551812. Online publication date: 1-Jul-2022
    • (2020) Improving Semantic Graph Connectivity for Word Sense Identification. Proceedings of the 12th International Conference on Computer Modeling and Simulation (132-137). DOI: 10.1145/3408066.3408099. Online publication date: 22-Jun-2020
    • (2020) Dynamics of topic formation and quantitative analysis of hot trends in physical science. Scientometrics. DOI: 10.1007/s11192-020-03610-6. Online publication date: 13-Jul-2020
    • (2019) Textual Entailment based on Semantic Similarity Using WordNet. 2019 2nd International Conference on Intelligent Computing, Instrumentation and Control Technologies (ICICICT) (1188-1192). DOI: 10.1109/ICICICT46008.2019.8993180. Online publication date: Jul-2019
    • (2018) Centroid-Based Lexical Clustering. Recent Applications in Data Clustering. DOI: 10.5772/intechopen.75433. Online publication date: 1-Aug-2018
    • (2018) Personality modelling and sentiment analysis on Chinese micro-blog posts. International Journal of Intelligent Information and Database Systems 11:1 (67-78). DOI: 10.5555/3271882.3271886. Online publication date: 1-Jan-2018
    • (2018) Trends in Document Analysis. Data Management, Analytics and Innovation (249-262). DOI: 10.1007/978-981-13-1402-5_19. Online publication date: 10-Aug-2018
    • (2017) Representations of context in recognizing the figurative and literal usages of idioms. Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (3230-3236). DOI: 10.5555/3298023.3298038. Online publication date: 4-Feb-2017
    • (2017) A knowledge-based word sense disambiguation algorithm utilizing syntactic dependency relation. 2017 8th IEEE Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON) (85-90). DOI: 10.1109/IEMCON.2017.8117155. Online publication date: Oct-2017
    • (2016) Survey of the word sense disambiguation and challenges for the Slovak language. 2016 IEEE 17th International Symposium on Computational Intelligence and Informatics (CINTI) (000225-000230). DOI: 10.1109/CINTI.2016.7846408. Online publication date: Nov-2016
