Tagger Evaluation Given Hierarchical Tag Sets

Melamed, I. Dan; Resnik, Philip

doi:10.1023/A:1002402902356

Published: April 2000

Computers and the Humanities, volume 34, pages 79–84 (2000)

Abstract

We present methods for evaluating human and automatic taggers that extend current practice in three ways. First, we show how to evaluate taggers that assign multiple tags to each test instance, even if they do not assign probabilities. Second, we show how to accommodate a common property of manually constructed "gold standards" that are typically used for objective evaluation, namely that there is often more than one correct answer. Third, we show how to measure performance when the set of possible tags is tree-structured in an IS-A hierarchy. To illustrate how our methods can be used to measure inter-annotator agreement, we show how to compute the kappa coefficient over hierarchical tag sets.
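For orientation, the kappa coefficient mentioned above corrects observed agreement for agreement expected by chance. The sketch below is not the hierarchical method of the article, whose details are not given in this abstract; it is only the standard two-annotator Cohen's kappa over a flat tag set, with one tag per item. The function name `cohen_kappa` and the sample tags are assumptions made for illustration.

```python
from collections import Counter


def cohen_kappa(tags_a, tags_b):
    """Cohen's kappa for two annotators who each assign one flat tag per item.

    Illustrative baseline only: flat tags, one tag per item, two annotators.
    """
    if len(tags_a) != len(tags_b) or not tags_a:
        raise ValueError("need two equal-length, non-empty tag sequences")
    n = len(tags_a)
    # Observed agreement: fraction of items given the same tag by both annotators.
    p_observed = sum(a == b for a, b in zip(tags_a, tags_b)) / n
    # Chance agreement: sum over tags of the product of each annotator's
    # marginal relative frequency for that tag.
    freq_a, freq_b = Counter(tags_a), Counter(tags_b)
    p_chance = sum(freq_a[t] * freq_b[t] for t in freq_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical annotations of four items with part-of-speech-like tags.
annotator_1 = ["N", "N", "V", "V"]
annotator_2 = ["N", "N", "V", "N"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Here observed agreement is 3/4 and chance agreement is (2·3 + 2·1)/16 = 1/2, so kappa is 0.5. Extending this to tree-structured tags would require replacing the exact-match test with a graded notion of agreement, which is the subject of the article itself.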

References

Atkins, S. “Tools for computer-aided lexicography: the Hector project”. In Papers in Computational Lexicography: COMPLEX '93. Budapest, 1993.
Carletta, J. “Assessing agreement on classification tasks: the Kappa statistic”. Computational Linguistics 22(2), 249–254, 1996.
Chinchor, N. (ed.) “Proceedings of the 7th Message Understanding Conference”. Columbia, MD: Science Applications International Corporation (SAIC), 1998. Online publication at http://www.muc.saic.com/proceedings/muc_7_toc.html.
Fellbaum, C. (ed.) WordNet: An Electronic Lexical Database; Cambridge, MA: MIT Press, 1998.
Krishnamurthy, R. and D. Nicholls. “Peeling an onion: the lexicographer's experience of manual sense-tagging”. In SENSEVAL Workshop. Sussex, England, 1998.
Resnik, P. and D. Yarowsky. “A perspective on word sense disambiguation methods and their evaluation”. In M. Light (ed.): ACL SIGLEX Workshop on Tagging Text with Lexical Semantics: Why, What, and How? Washington, D.C., 1997.
Resnik, P. and D. Yarowsky. “Distinguishing systems and distinguishing senses: New evaluation methods for word sense disambiguation”. Natural Language Engineering, 5(2), 1999.
Siegel, S. and N.J. Castellan, Jr. Nonparametric Statistics for the Behavioral Sciences. Second edition. McGraw-Hill, 1988.

Author information

Authors and Affiliations

West Group, USA
I. Dan Melamed
University of Maryland, USA
Philip Resnik

About this article

Cite this article

Melamed, I.D., Resnik, P. Tagger Evaluation Given Hierarchical Tag Sets. Computers and the Humanities 34, 79–84 (2000). https://doi.org/10.1023/A:1002402902356

Issue Date: April 2000
DOI: https://doi.org/10.1023/A:1002402902356
