Abstract
We present a system for detecting agreement errors in Basque, a language with agglutinative morphology and free order of the main sentence constituents. Owing to the complexity of agreement, agreement errors are among the most frequent error types found in written texts. Because the constituents involved in agreement can appear in any order in the sentence, the system works on dependency trees, which abstract over specific constituent orders. We use Saroi, a tool that retrieves the analysis trees satisfying a set of restrictions expressed as declarative rules. The tool is applied to the output of two dependency analyzers: MaltIxa (data-driven) and EDGK (rule-based). The system has been evaluated on two corpora: one of texts containing errors and one of correct texts. As a secondary result, we also estimate the impact of syntactic ambiguity on the quality of the results.
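The idea of checking agreement on a dependency tree rather than on the word sequence can be illustrated with a minimal sketch. It is not Saroi's rule language or the authors' implementation: the `Node` class, the `subj` relation label, and the `num` and `subj_num` feature names are assumptions made for illustration, and the auxiliary is folded into the verb node for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A dependency-tree node (hypothetical structure, not a real parser's output)."""
    form: str
    feats: dict
    deps: list = field(default_factory=list)  # list of (relation, Node) pairs


def subject_agreement_errors(root):
    """Return (head, subject) form pairs whose number features disagree.

    The check follows dependency relations only, so the linear position
    of the subject relative to the verb plays no role.
    """
    errors = []
    stack = [root]
    while stack:
        node = stack.pop()
        for rel, child in node.deps:
            if rel == "subj":
                expected = node.feats.get("subj_num")
                actual = child.feats.get("num")
                # Flag only when both features are present and differ.
                if expected and actual and expected != actual:
                    errors.append((node.form, child.form))
            stack.append(child)
    return errors


# Toy trees: a plural subject with a singular and with a plural verb form.
bad = Node("etorri da", {"subj_num": "sg"},
           [("subj", Node("gizonak", {"num": "pl"}))])
good = Node("etorri dira", {"subj_num": "pl"},
            [("subj", Node("gizonak", {"num": "pl"}))])

print(subject_agreement_errors(bad))
print(subject_agreement_errors(good))
```

In this toy setting the subject is given a single reading. A form with more than one possible analysis would need the check to be run per reading, which is where ambiguity can affect the results.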
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Oronoz, M., Díaz de Ilarraza, A., Gojenola, K. (2010). Design and Evaluation of an Agreement Error Detection System: Testing the Effect of Ambiguity, Parser and Corpus Type. In: Loftsson, H., Rögnvaldsson, E., Helgadóttir, S. (eds) Advances in Natural Language Processing. NLP 2010. Lecture Notes in Computer Science, vol 6233. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14770-8_32
DOI: https://doi.org/10.1007/978-3-642-14770-8_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14769-2
Online ISBN: 978-3-642-14770-8
eBook Packages: Computer Science, Computer Science (R0)