Abstract
Sensitive words are compound words whose syntactic category differs from that of their components. Depending on how it is segmented, a sensitive word may play different roles, leading to significantly different syntactic structures. If syntactic analysis of a Chinese sentence fails, the sensitive words should be examined first, rather than each segmentation alternative in turn, because re-segmenting them changes the syntactic structure of the sentence. This leads to higher efficiency. Our examination of a machine-readable dictionary shows that there are a great number of such words, which indicates that sensitive words are a widespread phenomenon in Chinese.
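The recovery strategy described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `sensitive` dictionary format (compound mapped to its components), the toy part-of-speech lexicon, and the toy parser (which accepts a sentence only if it contains a verb) are all hypothetical. The example sentence is likewise an illustration chosen here, not one taken from the paper.

```python
from itertools import combinations
from typing import Callable, Dict, Iterator, List, Optional, Tuple


def resplit_candidates(
    tokens: List[str], sensitive: Dict[str, List[str]]
) -> Iterator[List[str]]:
    """Yield segmentations obtained by splitting sensitive words,
    trying the fewest splits first."""
    positions = [i for i, tok in enumerate(tokens) if tok in sensitive]
    for k in range(1, len(positions) + 1):
        for chosen in combinations(positions, k):
            candidate: List[str] = []
            for i, tok in enumerate(tokens):
                if i in chosen:
                    candidate.extend(sensitive[tok])  # split into components
                else:
                    candidate.append(tok)
            yield candidate


def parse_with_sensitive_words(
    tokens: List[str],
    sensitive: Dict[str, List[str]],
    parse: Callable[[List[str]], Optional[List[str]]],
) -> Optional[Tuple[List[str], List[str]]]:
    """Parse the initial segmentation; on failure, re-segment only the
    sensitive words instead of enumerating every alternative."""
    result = parse(tokens)
    if result is not None:
        return tokens, result
    for candidate in resplit_candidates(tokens, sensitive):
        result = parse(candidate)
        if result is not None:
            return candidate, result
    return None  # a full system would fall back to other alternatives here


# Hypothetical toy lexicon and parser, for illustration only.
TOY_POS = {"他": "PRON", "将来": "NOUN", "将": "ADV", "来": "VERB", "上海": "NOUN"}
TOY_SENSITIVE = {"将来": ["将", "来"]}  # noun compound vs. adverb + verb


def toy_parse(tokens: List[str]) -> Optional[List[str]]:
    """Stand-in for a real parser: succeed only if a verb is present."""
    tags = [TOY_POS.get(tok, "UNK") for tok in tokens]
    return tags if "VERB" in tags else None


print(parse_with_sensitive_words(["他", "将来", "上海"], TOY_SENSITIVE, toy_parse))
# → (['他', '将', '来', '上海'], ['PRON', 'ADV', 'VERB', 'NOUN'])
```

In this toy case the initial segmentation treats 将来 ("future") as a noun, so the sentence has no verb and the stand-in parser rejects it. Splitting that one sensitive word into 将 ("will") and 来 ("come") supplies the verb, and only a single alternative has to be tried.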
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ren, F., Nie, JY. (2000). Sensitive Words and Their Application to Chinese Processing. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2000. Lecture Notes in Computer Science, vol 1902. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45323-7_21
DOI: https://doi.org/10.1007/3-540-45323-7_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41042-3
Online ISBN: 978-3-540-45323-9
eBook Packages: Springer Book Archive