Abstract
Word Sense Disambiguation (WSD) is the problem of determining the right sense of a polysemous word in a given context. In this paper, we will investigate the use of unlabeled data for WSD within the framework of semi supervised learning, in which the original labeled dataset is iteratively extended by exploiting unlabeled data. This paper addresses two problems occurring in this approach: determining a subset of new labeled data at each extension and generating the final classifier. By giving solutions for these problems, we generate some variants of bootstrapping algorithms and apply to word sense disambiguation. The experiments were done on the datasets of four words: interest, line, hard, and serve; and on English lexical sample of Senseval-3.
Keywords
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
References
Blum, A., Mitchell, T.: Combining labeled and unlabeled data with co-training. In: Proc. COLT, pp. 92–100 (1998)
Goldman, S., Zhou, Y.: Enhancing supervised learning with unlabeled data. In: Proc. ICML 2000, pp. 327–334 (2000)
Lee, Y.K., Ng, H.T.: An empirical evaluation of knowledge sources and learning algorithms for word sense disambiguation. In: Proc. EMNLP, pp. 41–48 (2002)
Le, C.A., Huynh, V.-N., Dam, H.-C., Shimazu, A.: Combining Classifiers Based on OWA Operators with an Application to Word Sense Disambiguation. In: Ślęzak, D., Wang, G., Szczuka, M.S., Düntsch, I., Yao, Y. (eds.) RSFDGrC 2005. LNCS, vol. 3641, pp. 512–521. Springer, Heidelberg (2005)
Mihalcea, R.: Co-training and Self-training for Word Sense Disambiguation. In: Proc. CoNLL, pp. 33–40 (2004)
Ng, H.T., Lee, H.B.: Integrating multiple knowledge sources to Disambiguate Word Sense: An exemplar-based approach. In: Proc. ACL, pp. 40–47 (1996)
Pham, T.P., Ng, H.T., Lee, W.S.: Word Sense Disambiguation with Semi-Supervised Learning. In: Proc. AAAI, pp. 1093–1098 (2005)
Pierce, D., Cardie, C.: Limitations of co-training for natural language learning from large datasets. In: Proc. EMNLP, pp. 1–9 (2001)
Yarowsky, D.: Unsupervised Word Sense Disambiguation Rivaling Supervised Methods. In: Proc. ACL, pp. 189–196 (1995)
Yu, N.Z., Hong, J.D., Lim, T.C.: Word Sense Disambiguation Using Label Propagation Based Semi-supervised Learning Method. In: Proc. ACL, pp. 395–402 (2005)
Su, W., Carpuat, M., Wu, D.: Semi-Supervised Training of a Kernel PCA-Based Model for Word Sense Disambiguation. In: Proc. COLING, pp. 1298–1304 (2004)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Le, AC., Shimazu, A., Nguyen, LM. (2006). Investigating Problems of Semi-supervised Learning for Word Sense Disambiguation. In: Matsumoto, Y., Sproat, R.W., Wong, KF., Zhang, M. (eds) Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead. ICCPOL 2006. Lecture Notes in Computer Science(), vol 4285. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11940098_51
Download citation
DOI: https://doi.org/10.1007/11940098_51
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49667-0
Online ISBN: 978-3-540-49668-7
eBook Packages: Computer ScienceComputer Science (R0)