Dual Filtering Strategy for Chinese Term Extraction

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3614)

Abstract

Automatic term extraction (ATR) is an important problem in natural language processing. However, most extraction methods focus on multiword units and, as a result, inevitably extract many common words or phrases as terms. In this paper, we propose a hybrid method for automatically extracting terms from domain-specific, unannotated Chinese documents. The method combines linguistic knowledge with statistical techniques, adopting a dual filtering strategy and introducing a weight formula to filter term candidates. The results indicate that our system is more efficient and precise than previous methods.
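
The abstract does not give the filters or the weight formula, so the following is only a minimal sketch of what a two-stage candidate filter could look like, not the authors' method. Everything in it is an assumption made for illustration: the function names `candidate_ngrams` and `dual_filter`, the common-word list used as the linguistic filter, the smoothed log-ratio of domain to background frequency used as the weight, and the requirement that the input is already word-segmented.

```python
import math
from collections import Counter


def candidate_ngrams(tokens, max_len=3):
    """Yield every contiguous n-gram (as a tuple) of length 1..max_len."""
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])


def dual_filter(domain_tokens, background_tokens, common_words,
                min_weight=0.5, max_len=3):
    """Return {candidate: weight} for candidates that pass both filters.

    Hypothetical illustration only: neither filter is taken from the paper.
    """
    domain_counts = Counter(candidate_ngrams(domain_tokens, max_len))
    background_counts = Counter(candidate_ngrams(background_tokens, max_len))

    # Filter 1 (assumed linguistic filter): drop candidates that begin or
    # end with a word from the common-word list.
    survivors = {
        cand: freq for cand, freq in domain_counts.items()
        if cand[0] not in common_words and cand[-1] not in common_words
    }

    # Filter 2 (assumed statistical weight): log-ratio of the relative
    # frequency in the domain corpus to the add-one-smoothed relative
    # frequency in a background corpus.
    domain_total = sum(domain_counts.values())
    background_total = sum(background_counts.values())
    weights = {}
    for cand, freq in survivors.items():
        domain_rel = freq / domain_total
        background_rel = (background_counts[cand] + 1) / (background_total + 1)
        weight = math.log(domain_rel / background_rel)
        if weight >= min_weight:
            weights[cand] = weight
    return weights


# Toy, pre-segmented input; real use would need a Chinese word segmenter.
domain = ["神经", "网络", "的", "训练", "神经", "网络"]
background = ["今天", "的", "天气", "很", "好", "的"]
terms = dual_filter(domain, background, common_words={"的"}, max_len=2)
for cand, weight in sorted(terms.items(), key=lambda kv: -kv[1]):
    print("".join(cand), round(weight, 3))
```

Applying the cheap list-based filter first means the weight is computed only for candidates that survive it, which is one plausible reason to order two filters this way.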

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, X., Li, X., Hu, Y., Lu, R. (2005). Dual Filtering Strategy for Chinese Term Extraction. In: Wang, L., Jin, Y. (eds) Fuzzy Systems and Knowledge Discovery. FSKD 2005. Lecture Notes in Computer Science, vol 3614. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11540007_97

  • DOI: https://doi.org/10.1007/11540007_97

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28331-7

  • Online ISBN: 978-3-540-31828-6

  • eBook Packages: Computer Science, Computer Science (R0)
