Abstract
This paper presents the construction of a Chinese word sense-tagged corpus. The resulting lexical resource has three main components: 1) a corpus annotated with word senses; 2) a lexicon containing sense distinctions and descriptions in a feature-based formalism; 3) links between the sense entries in the lexicon and CCD synsets. A dynamic model is put forward to build the three knowledge bases simultaneously and interactively. Because consistency is a thorny issue in constructing semantic resources, a strategy for improving it is also addressed. The inter-annotator agreement of the sense-tagged corpus is satisfactory. The database is expected to grow into a powerful lexical resource for both linguistic research on Chinese lexical semantics and word sense disambiguation.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wu, Y., Jin, P., Zhang, Y., Yu, S. (2006). A Chinese Corpus with Word Sense Annotation. In: Matsumoto, Y., Sproat, R.W., Wong, KF., Zhang, M. (eds) Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead. ICCPOL 2006. Lecture Notes in Computer Science(), vol 4285. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11940098_43
DOI: https://doi.org/10.1007/11940098_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49667-0
Online ISBN: 978-3-540-49668-7
eBook Packages: Computer Science (R0)