Abstract
This paper presents a method for the syntactic parsing of Hungarian natural language texts using a machine learning approach. This method learns tree patterns with various phrase types described by regular expressions from an annotated corpus. The PGS algorithm, an improved version of the RGLearn method, is developed and applied as a classifier in classifier combination schemas. Experiments show that classifier combinations, especially the Boosting algorithm, can effectively improve the recognition accuracy of the syntactic parser.
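As a rough illustration of the boosting idea mentioned above, the following is a minimal sketch of generic AdaBoost-style weighted voting. It is not the PGS algorithm or the authors' setup: the weak learner here is a hypothetical one-dimensional threshold rule ("stump"), labels are assumed to be +1 or -1, and all function names are invented for this example.

```python
import math

def train_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best 1-D threshold rule."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x >= threshold else -polarity for x in xs]
            err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def stump_predict(x, threshold, polarity):
    """Predict +1 or -1 for a single value with one threshold rule."""
    return polarity if x >= threshold else -polarity

def adaboost(xs, ys, rounds=5):
    """Train a weighted ensemble of stumps; labels in ys must be +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        # Clamp the error so the logarithm below stays finite.
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, threshold, polarity))
        # Raise the weight of misclassified examples, lower the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Combine the stumps by a weighted vote."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1
```

Each round reweights the training examples so that the next weak classifier concentrates on the cases the previous ones got wrong, and the final decision is a vote weighted by each classifier's accuracy. In a parsing setting the weak learner would be a pattern-based classifier rather than a threshold rule.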
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hócza, A., Felföldi, L., Kocsor, A. (2005). Learning Syntactic Patterns Using Boosting and Other Classifier Combination Schemas. In: Matoušek, V., Mautner, P., Pavelka, T. (eds) Text, Speech and Dialogue. TSD 2005. Lecture Notes in Computer Science, vol 3658. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11551874_9
DOI: https://doi.org/10.1007/11551874_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28789-6
Online ISBN: 978-3-540-31817-0
eBook Packages: Computer Science, Computer Science (R0)