Abstract
Educational standards put a renewed focus on strengthening students’ abilities to construct scientific explanations and engage in scientific arguments. Evaluating student explanatory writing is extremely time-intensive, so we are developing techniques that automatically analyze the causal structure of student essays to enable effective feedback. These techniques rely on a substantial training corpus of annotated essays. Because one of our long-term goals is to make it easier to establish this approach in new subject domains, we are keenly interested in how much training data is enough to support it. This paper describes our analysis of that question and examines one mechanism for reducing the data requirement, which uses student scores on a related multiple-choice test.
P. Hastings—The assessment project described in this article is funded, in part, by the Institute for Education Sciences, U.S. Department of Education (Grant R305F100007). The opinions expressed are those of the authors and do not represent views of the Institute or the U.S. Department of Education.
Notes
1. The choice of group size is significant. As mentioned above, the distribution of multiple-choice scores was approximately normal, and the least frequent score, 0, was assigned to 31 students. To maintain balanced representation of groups in the training set, some aggregation is necessary; otherwise, we could only test on a maximum of 31 items from each group. If the aggregation were too broad, however, it would reduce any benefit of balance in the training set.
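The trade-off in this note can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes essays are identified by integer ids, that adjacent multiple-choice scores are aggregated by integer division with a hypothetical `group_size` parameter, and that every group contributes the same number of essays, capped by the smallest group. Only the count of 31 students with a score of 0 comes from the note; all other counts are invented for illustration.

```python
import random
from collections import defaultdict


def balanced_sample(scores, group_size, per_group=None, seed=0):
    """Draw the same number of essays from each aggregated score group.

    scores: dict mapping essay id -> multiple-choice score (int).
    group_size: how many adjacent score values are merged into one group
        (hypothetical parameter name).
    per_group: requested essays per group; capped by the smallest group.
    Returns a dict mapping group index -> list of sampled essay ids.
    """
    groups = defaultdict(list)
    for essay_id, score in scores.items():
        groups[score // group_size].append(essay_id)

    # The smallest group limits how many essays a balanced sample can hold.
    smallest = min(len(ids) for ids in groups.values())
    n = smallest if per_group is None else min(per_group, smallest)

    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return {g: rng.sample(sorted(groups[g]), n) for g in sorted(groups)}


# Illustrative score counts: only the 31 essays at score 0 come from the note.
counts = {0: 31, 1: 60, 2: 120, 3: 120, 4: 60, 5: 40}
scores = {}
for score, count in counts.items():
    for _ in range(count):
        scores[len(scores)] = score

for size in (1, 2):
    sample = balanced_sample(scores, size)
    print(size, {g: len(ids) for g, ids in sample.items()})
```

With no aggregation (`group_size=1`), each group is limited to the 31 essays available at score 0. Merging pairs of adjacent scores raises that ceiling, but it also makes the groups coarser, which is the tension the note describes.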
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Hastings, P., Hughes, S., Blaum, D., Wallace, P., Britt, M.A. (2016). Stratified Learning for Reducing Training Set Size. In: Micarelli, A., Stamper, J., Panourgia, K. (eds) Intelligent Tutoring Systems. ITS 2016. Lecture Notes in Computer Science, vol 9684. Springer, Cham. https://doi.org/10.1007/978-3-319-39583-8_39
DOI: https://doi.org/10.1007/978-3-319-39583-8_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-39582-1
Online ISBN: 978-3-319-39583-8
eBook Packages: Computer Science, Computer Science (R0)