Abstract
Automated scoring of student language is a difficult task that requires systems to emulate complex, multi-faceted human evaluation criteria. Summary scoring adds a further layer of complexity because it involves comparing two texts of differing lengths. In this study, we present our approach to automating summary scoring by evaluating a corpus of approximately 5,000 summaries based on 103 source texts, with each summary scored on a 4-point Likert scale for seven different evaluation criteria. We train and evaluate a series of Machine Learning models that combine independent textual complexity indices from the ReaderBench framework with Deep Learning models based on the Transformer architecture in a multitask setup to predict all criteria concurrently. Our models achieve significantly lower errors than previous work using a similar dataset, with mean absolute error (MAE) ranging from 0.10 to 0.16 and corresponding R² values of up to 0.64. Our findings indicate that Longformer-based [1] models are adequate for contextualizing longer text sequences and for effectively scoring summaries according to a variety of human-defined evaluation criteria using a single Neural Network.
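As an illustration of how the two reported metrics can be computed for each criterion in a multitask setting, the following is a minimal sketch and not the authors' evaluation code. The function names, the criterion names, and the toy scores are assumptions made for the example; the abstract does not state how scores were scaled before the errors were computed.

```python
from statistics import fmean


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length score sequences."""
    return fmean(abs(t - p) for t, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true scores are identical")
    return 1.0 - ss_res / ss_tot


def per_criterion_metrics(y_true_rows, y_pred_rows, criteria):
    """Compute MAE and R^2 separately for every criterion (column).

    Each row holds one summary's scores, one value per criterion, as a
    multitask model with a single output vector would produce them.
    """
    results = {}
    for j, name in enumerate(criteria):
        true_col = [row[j] for row in y_true_rows]
        pred_col = [row[j] for row in y_pred_rows]
        results[name] = {"mae": mae(true_col, pred_col),
                         "r2": r2(true_col, pred_col)}
    return results


# Hypothetical criterion names and toy scores, for illustration only.
criteria = ["criterion_a", "criterion_b"]
y_true = [[1, 2], [2, 3], [3, 4], [4, 1]]
y_pred = [[1.5, 2], [2, 3], [3, 3.5], [3.5, 1]]

results = per_criterion_metrics(y_true, y_pred, criteria)
for name, scores in results.items():
    print(f"{name}: MAE={scores['mae']:.3f}, R2={scores['r2']:.3f}")
```

Reporting the metrics per criterion rather than as one pooled number keeps a weak criterion from being hidden by strong ones, which matters when a single network predicts all criteria at once.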
References
Beltagy, I., Peters, M.E., Cohan, A.: Longformer: the long-document transformer. arXiv preprint arXiv:2004.05150 (2020)
Bensoussan, M., Kreindler, I.: Improving advanced reading comprehension in a foreign language: summaries vs. short-answer questions. J. Res. Read. 13(1), 55–68 (1990)
Brown, A.L., Campione, J.C., Day, J.D.: Learning to learn: on training students to learn from texts. Educ. Res. 10(2), 14–21 (1981)
Bean, T.W., Steenwyk, F.L.: The effect of three forms of summarization instruction on sixth graders’ summary writing and comprehension. J. Read. Behav. 16(4), 297–306 (1984)
Karbalaei, A., Rajyashree, K.S.: The impact of summarization strategy training on university ESL learners’ reading comprehension. Int. J. Lang. Soc. Cult. 30, 41–53 (2010)
Pakzadian, M., Rasekh, A.E.: The effects of using summarization strategies on Iranian EFL learners’ reading comprehension. Engl. Linguist. Res. 1(1), 118–125 (2012)
Nenkova, A., Passonneau, R.J.: Evaluating content selection in summarization: the pyramid method. In: The Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL 2004), Boston, Massachusetts, USA, pp. 145–152. ACL (2004)
Van Halteren, H., Teufel, S.: Examining the consensus between human summaries: initial experiments with factoid analysis. In: HLT-NAACL 03 Text Summarization Workshop, Edmonton, Canada, pp. 57–64. ACL (2003)
Steinberger, J., Jezek, K.: Using latent semantic analysis in text summarization and summary evaluation. In: Proceedings of the 7th International Conference ISIM, pp. 93–100 (2004)
Botarleanu, R.-M., Dascalu, M., Allen, L.K., Crossley, S.A., McNamara, D.S.: Automated summary scoring with ReaderBench. In: Cristea, A.I., Troussas, C. (eds.) Intelligent Tutoring Systems. Lecture Notes in Computer Science, vol. 12677, pp. 321–332. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-80421-3_35
Torres-Moreno, J.-M., Saggion, H., Cunha, I.D., SanJuan, E., Velázquez-Morales, P.: Summary evaluation with and without references. Polibits 42, 13–20 (2010)
Facebook Inc.: Transformers (n.d.). https://huggingface.co/docs/transformers/index. Accessed 20 Jan 2022
Acknowledgments
This research was supported by a grant from the Romanian National Authority for Scientific Research and Innovation, CNCS – UEFISCDI, project number TE 70 PN-III-P1-1.1-TE-2019-2209, ATES – “Automated Text Evaluation and Simplification”, the Institute of Education Sciences (R305A180144 and R305A180261), and the Office of Naval Research (N00014-17-1-2300; N00014-20-1-2623; N00014-19-1-2424, N00014-20-1-2627). The opinions expressed are those of the authors and do not represent the views of the IES or ONR.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Botarleanu, RM., Dascalu, M., Allen, L.K., Crossley, S.A., McNamara, D.S. (2022). Multitask Summary Scoring with Longformers. In: Rodrigo, M.M., Matsuda, N., Cristea, A.I., Dimitrova, V. (eds) Artificial Intelligence in Education. AIED 2022. Lecture Notes in Computer Science, vol 13355. Springer, Cham. https://doi.org/10.1007/978-3-031-11644-5_79
DOI: https://doi.org/10.1007/978-3-031-11644-5_79
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-11643-8
Online ISBN: 978-3-031-11644-5
eBook Packages: Computer Science, Computer Science (R0)