Word Frequency Statistics Model for Chinese Base Noun Phrase Identification

  • Conference paper
Intelligent Computing Methodologies (ICIC 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8589)

Abstract

Chinese base phrase identification plays an important role in natural language processing, but both its recognition scope and its methods still need improvement. This paper presents a method for Chinese base noun phrase identification based on a word frequency statistics model. The method builds a noun phrase dictionary from a training corpus, calculates the co-occurrence frequency and threshold of the noun phrases, and constructs a word table according to the different roles that words play within a noun phrase. Unknown word processing and rule templates are then added, and the results are finally improved with error correction processing. Experiments on the test corpus show that, across different domains, base noun phrase identification achieves an average precision of 91.28% and an average recall of 93.22%.
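
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact statistics, so the co-occurrence ratio (how often an adjacent word pair appears inside an annotated noun phrase, divided by how often it appears at all), the default threshold of 0.5, the "head"/"modifier" role names, the toy corpus, and every function name below are assumptions made for illustration. Unknown word processing, rule templates, and error correction are left out.

```python
from collections import Counter


def build_statistics(corpus):
    """Collect pair counts and word roles from a chunk-annotated corpus.

    corpus: list of sentences; each sentence is a list of (words, is_np)
    chunks, where words is a list of already segmented tokens.
    (Hypothetical input format, chosen for this sketch.)
    """
    pair_in_np = Counter()   # adjacent pairs seen inside a base noun phrase
    pair_total = Counter()   # adjacent pairs seen anywhere in the corpus
    roles = {"head": Counter(), "modifier": Counter()}
    for sentence in corpus:
        flat = [word for words, _ in sentence for word in words]
        pair_total.update(zip(flat, flat[1:]))
        for words, is_np in sentence:
            if not is_np:
                continue
            pair_in_np.update(zip(words, words[1:]))
            # Assumed role scheme: last word is the head, the rest modify it.
            roles["head"][words[-1]] += 1
            roles["modifier"].update(words[:-1])
    return pair_in_np, pair_total, roles


def cooccurrence_ratio(pair, pair_in_np, pair_total):
    """Share of a pair's occurrences that fall inside a noun phrase."""
    total = pair_total[pair]
    return pair_in_np[pair] / total if total else 0.0


def identify_noun_phrases(words, pair_in_np, pair_total, threshold=0.5):
    """Greedily join adjacent words whose ratio reaches the threshold."""
    phrases, current = [], []
    for word in words:
        if current and cooccurrence_ratio(
            (current[-1], word), pair_in_np, pair_total
        ) >= threshold:
            current.append(word)
        else:
            if len(current) > 1:
                phrases.append(tuple(current))
            current = [word]
    if len(current) > 1:
        phrases.append(tuple(current))
    return phrases


def precision_recall(predicted, gold):
    """Exact-match precision and recall over sets of phrases."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Toy training corpus (invented for this example).
corpus = [
    [(["自然", "语言", "处理"], True), (["很", "重要"], False)],
    [(["我们", "研究"], False), (["自然", "语言"], True)],
]
pair_in_np, pair_total, roles = build_statistics(corpus)
found = identify_noun_phrases(
    ["我们", "研究", "自然", "语言", "处理"], pair_in_np, pair_total
)
print(found)  # [('自然', '语言', '处理')]
print(precision_recall(found, [("自然", "语言", "处理")]))  # (1.0, 1.0)
```

In this sketch the threshold trades precision against recall: raising it joins fewer word pairs, which tends to shorten or drop candidate phrases.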

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Kong, L., Ren, F., Sun, X., Quan, C. (2014). Word Frequency Statistics Model for Chinese Base Noun Phrase Identification. In: Huang, DS., Jo, KH., Wang, L. (eds) Intelligent Computing Methodologies. ICIC 2014. Lecture Notes in Computer Science, vol 8589. Springer, Cham. https://doi.org/10.1007/978-3-319-09339-0_64

  • DOI: https://doi.org/10.1007/978-3-319-09339-0_64

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09338-3

  • Online ISBN: 978-3-319-09339-0

  • eBook Packages: Computer Science, Computer Science (R0)
