Abstract
The paper contributes to the task of automated evaluation of surface coherence. It introduces a coreference-related extension to the EVALD applications, which evaluate essays written by native and non-native students learning Czech. By employing a coreference resolver and coreference-related features, our system outperforms the original EVALD approaches by up to 8 percentage points. The paper also introduces a dataset for evaluating non-native speakers, which was collected from multiple corpora; the parts lacking coherence grade annotation were judged manually. The resulting corpus contains a sufficient number of examples for each grading level.
Acknowledgment
The authors acknowledge support from the Ministry of Culture of the Czech Republic (project No. DG16P02B016 Automatic Evaluation of Text Coherence in Czech). This work used language resources developed, stored and distributed by the LINDAT/CLARIN project of the Ministry of Education, Youth and Sports of the Czech Republic (project LM2015071).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Novák, M., Rysová, K., Rysová, M., Mírovský, J. (2017). Incorporating Coreference to Automatic Evaluation of Coherence in Essays. In: Camelin, N., Estève, Y., Martín-Vide, C. (eds) Statistical Language and Speech Processing. SLSP 2017. Lecture Notes in Computer Science, vol 10583. Springer, Cham. https://doi.org/10.1007/978-3-319-68456-7_5
DOI: https://doi.org/10.1007/978-3-319-68456-7_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68455-0
Online ISBN: 978-3-319-68456-7
eBook Packages: Computer Science, Computer Science (R0)