Abstract
While rule-based shallow parsers usually recognise the syntactic heads of phrases, statistical syntactic chunkers typically do not. Finding heads within already recognised chunks is not trivial for languages with relatively free word order, such as German or Polish, yet this information can be very useful.
We propose a simple solution that incorporates head recognition into existing chunkers by extending the standard IOB2 representation with information on head location. To evaluate this approach, we introduced the new representation into a CRF chunker for Polish. Although the idea is very simple, the results are surprisingly good.
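The idea of carrying head location inside IOB2 tags can be illustrated with a minimal sketch. The abstract does not specify the exact tag inventory, so the `-H` suffix used below, the function names, and the `(start, end, label, head)` tuple layout are all assumptions made for illustration, not the authors' actual encoding. The English tokens are likewise only a toy example.

```python
from typing import List, Optional, Tuple

# Hypothetical chunk layout: (start, end_exclusive, label, head_index).
Chunk = Tuple[int, int, str, Optional[int]]


def encode_iob2_with_heads(n_tokens: int, chunks: List[Chunk]) -> List[str]:
    """Return one tag per token: plain IOB2, plus an assumed '-H' suffix
    on the token that is the syntactic head of its chunk."""
    tags = ["O"] * n_tokens
    for start, end, label, head in chunks:
        if head is None or not (0 <= start <= head < end <= n_tokens):
            raise ValueError(f"invalid chunk: {(start, end, label, head)}")
        for i in range(start, end):
            prefix = "B" if i == start else "I"
            suffix = "-H" if i == head else ""
            tags[i] = f"{prefix}-{label}{suffix}"
    return tags


def decode_iob2_with_heads(tags: List[str]) -> List[Chunk]:
    """Recover chunks and head positions from the extended tags.
    The head is None when no token of a chunk carries the '-H' suffix."""
    chunks: List[Chunk] = []
    start = label = head = None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last chunk
        is_head = tag.endswith("-H")
        core = tag[:-2] if is_head else tag
        continues = core.startswith("I-") and label == core[2:]
        if not continues:
            if label is not None:
                chunks.append((start, i, label, head))
            if core == "O":
                start = label = head = None
            else:
                start, label, head = i, core[2:], None
        if is_head and head is None:
            head = i
    return chunks


tokens = ["the", "old", "house", "stands"]
gold = [(0, 3, "NP", 2), (3, 4, "VP", 3)]
tags = encode_iob2_with_heads(len(tokens), gold)
print(list(zip(tokens, tags)))
print(decode_iob2_with_heads(tags))
```

Because the head marker is part of the per-token label, any sequence labeller that already predicts IOB2 tags (a CRF, for instance) could in principle be trained on the extended tags without changes to the model itself; the cost is a larger label set.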
This work was financed by the National Centre for Research and Development (NCBiR) project SP/I/1/77065/10 (“SyNaT”).
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Radziszewski, A., Pawlaczek, A. (2013). Incorporating Head Recognition into a CRF Chunker. In: Kłopotek, M.A., Koronacki, J., Marciniak, M., Mykowiecka, A., Wierzchoń, S.T. (eds) Language Processing and Intelligent Information Systems. IIS 2013. Lecture Notes in Computer Science, vol 7912. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38634-3_3
DOI: https://doi.org/10.1007/978-3-642-38634-3_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38633-6
Online ISBN: 978-3-642-38634-3
eBook Packages: Computer Science (R0)