Abstract
In this article we present two grammars, GramCat and GramEsp, for chunking unrestricted Catalan and Spanish text. These grammars extend the classical notion of chunk as defined by Abney by taking advantage of morphosyntactic features of the two languages: their rich inflectional morphology and the high frequency of some prepositional patterns allow us to include both pre- and post-nominal modifiers in the noun phrase.
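To illustrate the difference between a classical noun chunk, which ends at the head noun, and an extended one that also absorbs post-nominal modifiers, here is a minimal sketch in Python. It is not taken from GramCat or GramEsp: the one-letter tagset, the two patterns, the `chunk` helper, and the example sentence are all assumptions made for illustration, and the extended pattern accepts any preposition, whereas a real grammar would presumably restrict which prepositional patterns may attach.

```python
import re

# Hypothetical one-letter tagset (an assumption, not the grammars' real tagset):
# D = determiner, A = adjective, N = noun, P = preposition, V = verb

# Classical chunk: optional determiner, pre-nominal adjectives, head noun.
CLASSIC_NP = re.compile(r"D?A*N")

# Extended chunk: additionally takes post-nominal adjectives and
# prepositional modifiers that themselves contain a noun group.
EXTENDED_NP = re.compile(r"D?A*NA*(?:PD?A*NA*)*")


def chunk(tagged, pattern):
    """Return the word sequences whose tag sequence matches `pattern`."""
    # One character per token, so regex offsets map directly to token indices.
    tags = "".join(tag for _, tag in tagged)
    return [
        " ".join(word for word, _ in tagged[m.start():m.end()])
        for m in pattern.finditer(tags)
    ]


# Spanish example: "la casa blanca de mi hermano tiene un gran jardín"
sentence = [
    ("la", "D"), ("casa", "N"), ("blanca", "A"), ("de", "P"),
    ("mi", "D"), ("hermano", "N"), ("tiene", "V"),
    ("un", "D"), ("gran", "A"), ("jardín", "N"),
]

print(chunk(sentence, CLASSIC_NP))
print(chunk(sentence, EXTENDED_NP))
```

With the classical pattern the subject is split into "la casa" and "mi hermano", and the adjective "blanca" is left outside any chunk. With the extended pattern, "la casa blanca de mi hermano" is returned as a single noun phrase.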
The work presented here was partially funded by the Xtract2 project (Platform of Linguistic Engineering resources BFF2002-04226-C03-03).
References
Abney, S.: Parsing by Chunks. Principle-Based Parsing (1991)
Abney, S.: Partial Parsing via Finite-State Cascades. Proceedings of the ESSLLI'96 Robust Parsing Workshop (1996)
Arévalo, M., Civit, M., Martí, M.A.: MICE: A Module for Named Entity Recognition and Classification. International Journal of Corpus Linguistics, Volume 9, Number 1. John Benjamins (2004)
Atserias, J., Rodríguez, H.: TACAT: TAgged Corpus Text Analyser. Technical Report, Software Department, UPC (1998)
Bosque, I., Demonte, V.: Gramática Descriptiva de la Lengua Española. Espasa-Calpe (1999)
Civit, M., Martí, M.A.: Design Principles for a Spanish Treebank. Proceedings of the First Workshop on Treebanks and Linguistic Theories (TLT2002). Sozopol, Bulgaria (2002), 61–77
Civit, M.: Criterios de etiquetación y desambiguación morfosintáctica de corpus en español. Sociedad Española para el Procesamiento del Lenguaje Natural. Colección monografías 3 (2003)
Civit, M., Bufí, N., Valverde, M.P.: CAT3LB: a Treebank for Catalan with Word Sense Annotation. 3rd Workshop on Treebanks and Linguistic Theories. Tuebingen, Germany (2004)
Gala, N.: Using the Incremental Finite State Architecture to create a Spanish Shallow Parser. SEPLN, Proceedings of the 15th Conference of the SEPLN Lleida (1999), 75–82
Gelbukh, A., Sidorov, G., Galicia-Haro, S., Bolshakov, I.: Environment for Development of a Natural Language Syntactic Analyzer. Acta Academia (2002)
Kermes, H., Evert, S.: Text analysis meets corpus linguistics. Corpus Linguistics (2003) 402–411
Moreno, A., Grishman, R., López, S., Sánchez, F., Sekine, S.: A Treebank of Spanish and its Application to Parsing. Proceedings of the Second Conference on Language Resources and Evaluation (LREC) (2000) 107–111
Sebastián, N., Martí, M.A., Carreiras, M.F., Cuetos, F.: LEXESP: Léxico Informatizado del Español. Edicions de la Universitat de Barcelona (2000)
Solà, J., Lloret, M.R., Mascaró, J., Pérez, M.: Gramàtica del català contemporani. Empúries (2002)
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Civit, M., Antònia Martí, M. (2005). GramCat and GramEsp: two grammars for chunking. In: Kłopotek, M.A., Wierzchoń, S.T., Trojanowski, K. (eds) Intelligent Information Processing and Web Mining. Advances in Soft Computing, vol 31. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-32392-9_17
DOI: https://doi.org/10.1007/3-540-32392-9_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25056-2
Online ISBN: 978-3-540-32392-1
eBook Packages: Engineering, Engineering (R0)