Abstract
Summarization is an effective strategy for promoting learning and deep comprehension of texts. However, teachers seldom implement summarization in classrooms because manually evaluating students’ summaries requires time and effort. This problem has led to the development of automated models of summarization quality. However, these models often rely on features derived from expert ratings of student summaries of specific source texts and therefore do not generalize to summaries of new texts. Further, many of the models rely on proprietary tools that are not freely or publicly available, making replication difficult. In this study, we introduce an automated summarization evaluation (ASE) model that depends strictly on features of the source text or the summary, allowing for a purely text-based model of quality. This model classifies summaries as either low or high quality with an accuracy above 80%. Importantly, the model was developed on a large number of source texts, allowing for generalizability across texts. Further, the features used in this study are freely and publicly available, affording replication.
Acknowledgments
This research was supported in part by the Institute of Education Sciences (IES R305A180261). Ideas expressed in this material are those of the authors and do not necessarily reflect the views of the IES. We also thank Amy Johnson, Kristopher Kopp, and Cecile Perret for their help in collecting the data.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Crossley, S.A., Kim, M., Allen, L., McNamara, D. (2019). Automated Summarization Evaluation (ASE) Using Natural Language Processing Tools. In: Isotani, S., Millán, E., Ogan, A., Hastings, P., McLaren, B., Luckin, R. (eds) Artificial Intelligence in Education. AIED 2019. Lecture Notes in Computer Science, vol 11625. Springer, Cham. https://doi.org/10.1007/978-3-030-23204-7_8
DOI: https://doi.org/10.1007/978-3-030-23204-7_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-23203-0
Online ISBN: 978-3-030-23204-7
eBook Packages: Computer Science (R0)