
Parsing the Penn Chinese Treebank with Semantic Knowledge

  • Conference paper
Natural Language Processing – IJCNLP 2005 (IJCNLP 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3651)



We build a class-based selection preference sub-model to incorporate external semantic knowledge from two Chinese electronic semantic dictionaries. This sub-model is combined with the modifier-head generation sub-model. After the combined model is optimized on held-out data with the EM algorithm, our improved parser achieves an F1 measure of 79.4% on the Penn Chinese Treebank (CTB), a 4.4% relative reduction in error rate. Further analysis of the improvement indicates that semantic knowledge helps with nominal compounds, coordination, and N⋄V tagging disambiguation, and that it alleviates the sparseness of information available in the treebank.
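The abstract does not say how the two sub-models are combined. One common reading is a linear interpolation whose weight is fitted on held-out data with EM. The sketch below illustrates that idea only. The function name, the interpolation form, and the input format are assumptions for illustration, not the authors' implementation.

```python
from typing import List, Tuple


def em_interpolation_weight(
    heldout: List[Tuple[float, float]],
    lam: float = 0.5,
    iterations: int = 50,
) -> float:
    """Estimate lam in p = lam * p_class + (1 - lam) * p_lex by EM.

    Hypothetical setup: each held-out event is a pair (p_class, p_lex),
    the probabilities that a class-based sub-model and a lexical
    modifier-head sub-model assign to that event.
    """
    if not heldout:
        raise ValueError("held-out data must not be empty")
    for _ in range(iterations):
        # E-step: sum the posterior responsibility of the class-based
        # component over all held-out events.
        total = 0.0
        for p_class, p_lex in heldout:
            mix = lam * p_class + (1.0 - lam) * p_lex
            if mix > 0.0:
                total += lam * p_class / mix
        # M-step: the new weight is the mean responsibility.
        lam = total / len(heldout)
    return lam


if __name__ == "__main__":
    # Toy held-out events; these probabilities are made up for the demo.
    events = [(0.30, 0.05), (0.10, 0.20), (0.25, 0.02)]
    print(em_interpolation_weight(events))
```

Each iteration cannot decrease the held-out likelihood, so the weight drifts toward whichever sub-model explains the held-out events better.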





Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xiong, D., Li, S., Liu, Q., Lin, S., Qian, Y. (2005). Parsing the Penn Chinese Treebank with Semantic Knowledge. In: Dale, R., Wong, KF., Su, J., Kwong, O.Y. (eds) Natural Language Processing – IJCNLP 2005. IJCNLP 2005. Lecture Notes in Computer Science, vol 3651. Springer, Berlin, Heidelberg.



  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29172-5

  • Online ISBN: 978-3-540-31724-1

  • eBook Packages: Computer Science, Computer Science (R0)
