Adding New Words into A Chinese Thesaurus

Computers and the Humanities

Abstract

In this paper, we study the problem of adding a large number of new words to a Chinese thesaurus according to their definitions in a Chinese dictionary, while minimizing the effort of hand tagging. To deal with the problem, we first use a supervised learning technique to learn a set of defining formats for each class in the thesaurus; these formats characterize the regularities in the definitions of the words in that class. We then use traditional techniques from graph theory to derive a minimal subset of the new words that meets the following condition: if the words in the subset are added to the thesaurus by hand, the remaining new words can be added automatically by matching their definitions against the defining formats of each class. The method uses little, if any, language-specific or thesaurus-specific knowledge, and can be applied to thesauri of other languages.
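The condition on the hand-tagged subset can be illustrated with a small sketch. It is not the paper's algorithm, which the abstract does not spell out; it assumes, purely for illustration, that a new word can be classified automatically once every other new word mentioned in its definition is already in the thesaurus. The names `depends_on`, `auto_addable`, `covers_all` and `smallest_seed_set` are hypothetical, and the brute-force search stands in for whatever graph-theoretic technique the authors actually use.

```python
from itertools import combinations
from typing import Dict, Set


def auto_addable(depends_on: Dict[str, Set[str]], hand_tagged: Set[str]) -> Set[str]:
    """Words that become automatically classifiable once `hand_tagged` is added by hand.

    ASSUMPTION (not from the paper): `depends_on[w]` is the set of other new
    words used in the definition of w, and w can be matched against the
    defining formats as soon as all of those words are in the thesaurus.
    Every dependency is assumed to appear as a key of `depends_on`.
    """
    known = set(hand_tagged)
    changed = True
    while changed:  # repeat until no further word can be added
        changed = False
        for word, deps in depends_on.items():
            if word not in known and deps <= known:
                known.add(word)
                changed = True
    return known - hand_tagged


def covers_all(depends_on: Dict[str, Set[str]], hand_tagged: Set[str]) -> bool:
    """True if hand tagging this subset lets every other new word be added automatically."""
    return (auto_addable(depends_on, hand_tagged) | hand_tagged) == set(depends_on)


def smallest_seed_set(depends_on: Dict[str, Set[str]]) -> Set[str]:
    """Exhaustive search for a smallest covering subset (exponential; toy inputs only)."""
    words = sorted(depends_on)
    for size in range(len(words) + 1):
        for combo in combinations(words, size):
            if covers_all(depends_on, set(combo)):
                return set(combo)
    return set(words)  # not reached: tagging every word always covers


# Hypothetical toy data: "a" and "b" define each other, "c" relies on "a",
# and "d" relies only on words already in the thesaurus.
deps = {"a": {"b"}, "b": {"a"}, "c": {"a"}, "d": set()}
print(smallest_seed_set(deps))
```

In this toy input the mutual dependence between "a" and "b" is what forces at least one word to be tagged by hand; words whose definitions rely only on existing thesaurus entries need no hand tagging under the stated assumption.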

Cite this article

Donghong, J., Junping, G. & Changning, H. Adding New Words into A Chinese Thesaurus. Computers and the Humanities 31, 203–227 (1997). https://doi.org/10.1023/A:1000980024577

  • DOI: https://doi.org/10.1023/A:1000980024577