
Chinese Terminology Extraction Using Window-Based Contextual Information

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2007)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4394)

Abstract

Terminology extraction is an important task for automatically updating domain-specific knowledge. Contextual information helps to decide whether newly extracted candidate terms are genuine terminology. Because extraction based on fixed patterns is of very limited use on natural language text, both syntactic and semantic information from the context of a term is needed to determine its termhood. In this paper, we investigate two window-based context word extraction methods that take syntactic and semantic information into account. Based on the individual performance of each method, we propose a hybrid method that combines both syntactic and semantic information. Experiments show that the hybrid method achieves a significant improvement.
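As a rough illustration of what window-based context word extraction can look like, the sketch below counts the words that fall within a fixed-size window around each occurrence of a candidate term in pre-segmented text. It is only a minimal sketch under stated assumptions: the function name `window_context`, the window size, and the toy sentence are hypothetical, the candidate is assumed to be a single token, and the syntactic and semantic scoring used in the paper is not reproduced here.

```python
from collections import Counter


def window_context(tokens, candidate, window=2):
    """Count words within `window` positions of each occurrence of `candidate`.

    `tokens` is assumed to be an already segmented sentence or document
    (a list of strings). This is an illustrative helper, not the paper's method.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != candidate:
            continue
        left = tokens[max(0, i - window):i]       # up to `window` words before
        right = tokens[i + 1:i + 1 + window]      # up to `window` words after
        counts.update(left + right)
    return counts


# Toy pre-segmented sentence (hypothetical data, not taken from the paper)
tokens = ["我们", "使用", "神经", "网络", "模型", "进行", "训练"]
context = window_context(tokens, "网络", window=2)
```

Context counts of this kind could then feed a termhood score, for example by weighting context words according to their part of speech or their association with known domain terms; how the paper does this is not stated in the abstract.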




Editor information

Alexander Gelbukh


Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ji, L., Sum, M., Lu, Q., Li, W., Chen, Y. (2007). Chinese Terminology Extraction Using Window-Based Contextual Information. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2007. Lecture Notes in Computer Science, vol 4394. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70939-8_6


  • DOI: https://doi.org/10.1007/978-3-540-70939-8_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-70938-1

  • Online ISBN: 978-3-540-70939-8

  • eBook Packages: Computer Science, Computer Science (R0)
