Abstract
Senseval was the first open, community-based evaluation exercise for Word Sense Disambiguation programs. It adopted the quantitative approach to evaluation developed in MUC and other ARPA evaluation exercises. It took place in 1998. In this paper we describe the structure, organisation and results of the SENSEVAL exercise for English. We present and defend various design choices for the exercise, describe the data and gold-standard preparation, consider issues of scoring strategies and baselines, and present the results for the 18 participating systems. The exercise identifies the state-of-the-art for fine-grained word sense disambiguation, where training data is available, as 74–78% correct, with a number of algorithms approaching this level of performance. For systems that did not assume the availability of training data, performance was markedly lower and also more variable. Human inter-tagger agreement was high, with the gold standard taggings being around 95% replicable.
Cite this article
Kilgarriff, A., Rosenzweig, J. Framework and Results for English SENSEVAL. Computers and the Humanities 34, 15–48 (2000). https://doi.org/10.1023/A:1002693207386
DOI: https://doi.org/10.1023/A:1002693207386