Incorporating Coreference to Automatic Evaluation of Coherence in Essays

Novák, Michal; Rysová, Kateřina; Rysová, Magdaléna; Mírovský, Jiří

doi:10.1007/978-3-319-68456-7_5

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10583)

Included in the following conference series:

International Conference on Statistical Language and Speech Processing

Abstract

The paper contributes to the task of automated evaluation of surface coherence. It introduces a coreference-related extension to the EVALD applications, which evaluate essays written by native and non-native students learning Czech. By employing a coreference resolver and coreference-related features, our system outperforms the original EVALD approaches by up to 8 percentage points. The paper also introduces a dataset for evaluating non-native speakers' essays; it was collected from multiple corpora, and the parts lacking coherence-grade annotation were judged manually. The resulting corpus contains a sufficient number of examples for each grading level.

Notes

1.
http://www.alte.org.

Acknowledgment

The authors acknowledge support from the Ministry of Culture of the Czech Republic (project No. DG16P02B016 Automatic Evaluation of Text Coherence in Czech). This work used language resources developed, stored and distributed by the LINDAT/CLARIN project of the Ministry of Education, Youth and Sports of the Czech Republic (project LM2015071).

Author information

Authors and Affiliations

Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics, Malostranské náměstí 25, 11800, Prague 1, Czech Republic
Michal Novák, Kateřina Rysová, Magdaléna Rysová & Jiří Mírovský

Corresponding author

Correspondence to Michal Novák.

Editor information

Editors and Affiliations

University of Le Mans, Le Mans, France
Nathalie Camelin
University of Le Mans, Le Mans, France
Yannick Estève
Rovira i Virgili University, Tarragona, Spain
Carlos Martín-Vide

About this paper

Cite this paper

Novák, M., Rysová, K., Rysová, M., Mírovský, J. (2017). Incorporating Coreference to Automatic Evaluation of Coherence in Essays. In: Camelin, N., Estève, Y., Martín-Vide, C. (eds) Statistical Language and Speech Processing. SLSP 2017. Lecture Notes in Computer Science, vol 10583. Springer, Cham. https://doi.org/10.1007/978-3-319-68456-7_5

DOI: https://doi.org/10.1007/978-3-319-68456-7_5
Published: 27 September 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68455-0
Online ISBN: 978-3-319-68456-7
eBook Packages: Computer Science (R0)
