Abstract
We present a deterministic model that predicts all the phrase boundaries of a syntactic tree, covering both base and nested constituent boundaries. The model uses only word and part-of-speech (POS) information, whereas general parsers also use phrase type information. The model is divided into two stages, which are in turn realized as four classification sub-models. Tested on Penn Treebank Section 23 with gold-standard POS tags, its F-score is comparable to that of the Stanford parser's PCFG and factored models, which shows that phrase boundaries can be identified without phrase labels at an accuracy comparable to the Stanford parser.
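To make the evaluation setting concrete, the following is a minimal sketch of an unlabeled bracketing F-score, the kind of measure that applies when phrase boundaries are predicted without phrase labels. It is an illustration under stated assumptions, not the authors' scoring setup: the function names `spans_from_brackets` and `bracket_f1` are hypothetical, every bracket is assumed to carry a label, single-word spans are ignored, and duplicate spans from unary chains are collapsed.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end) word offsets, end exclusive


def spans_from_brackets(tree: str) -> Set[Span]:
    """Collect unlabeled multi-word constituent spans from a bracketed tree.

    Assumption: every "(" is followed by a label token, which is skipped,
    so only the boundaries (not the phrase types) are retained.
    """
    tokens = tree.replace("(", " ( ").replace(")", " ) ").split()
    spans: Set[Span] = set()
    stack: List[int] = []  # start offsets of constituents still open
    pos = 0                # number of words consumed so far
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            stack.append(pos)
            i += 2         # skip the bracket and its label
        elif tok == ")":
            start = stack.pop()
            if pos - start > 1:   # ignore single-word spans
                spans.add((start, pos))
            i += 1
        else:
            pos += 1       # a word
            i += 1
    return spans


def bracket_f1(gold: Set[Span], pred: Set[Span]) -> float:
    """Harmonic mean of bracket precision and recall over span sets."""
    matched = len(gold & pred)
    precision = matched / len(pred) if pred else 0.0
    recall = matched / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold tree and a label-free prediction for the same sentence.
gold_tree = ("(S (NP (DT the) (NN cat)) "
             "(VP (VBD sat) (PP (IN on) (NP (DT the) (NN mat)))))")
pred_tree = ("(X (X (X the) (X cat)) "
             "(X (X (X sat) (X on)) (X (X the) (X mat))))")

gold_spans = spans_from_brackets(gold_tree)
pred_spans = spans_from_brackets(pred_tree)
f1 = bracket_f1(gold_spans, pred_spans)
```

In this toy pair the prediction wrongly groups "sat on" and misses the span "on the mat", so four of five brackets match on each side. A standard bracket scorer handles details this sketch omits, such as punctuation and unary phrases over single words.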
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dong, Z., Zhao, T. (2010). A Deterministic Method to Predict Phrase Boundaries of a Syntactic Tree. In: Huang, DS., Zhang, X., Reyes García, C.A., Zhang, L. (eds) Advanced Intelligent Computing Theories and Applications. With Aspects of Artificial Intelligence. ICIC 2010. Lecture Notes in Computer Science, vol 6216. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14932-0_80
DOI: https://doi.org/10.1007/978-3-642-14932-0_80
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14931-3
Online ISBN: 978-3-642-14932-0
eBook Packages: Computer Science, Computer Science (R0)