
Active Learning for Improving Machine Learning of Student Explanatory Essays

Conference paper
Artificial Intelligence in Education (AIED 2018)
Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10947)

Abstract

There is an increasing emphasis, especially in STEM areas, on students’ abilities to create explanatory descriptions. Holistic, overall evaluations of explanations can be performed relatively easily with shallow language processing by humans or computers. However, this provides little information about an essential element of explanation quality: the structure of the explanation, i.e., how it connects causes to effects. The difficulty of providing feedback on explanation structure can lead teachers either to avoid giving this type of assignment or to provide only shallow feedback on it. Using machine learning techniques, we have developed successful computational models for analyzing explanatory essays. A major cost of developing such models is the time and effort required for human annotation of the essays. As part of a large project studying students’ reading processes, we have collected a large number of explanatory essays and thoroughly annotated them. We then used the annotated essays to train our machine learning models. In this paper, we focus on how to get the best payoff from the expensive annotation process within such an educational context, and we evaluate a method called Active Learning.
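The pool-based active learning loop evaluated in the paper can be summarized in a few lines. The sketch below is a minimal, hypothetical illustration in Python with scikit-learn, using synthetic data and a least-confidence uncertainty criterion; the authors' actual essay features, models, and selection strategies are not reproduced here (see also notes 1 and 2 below on the increment size and the holdout set).

```python
# Minimal pool-based active learning sketch (hypothetical; not the authors' code).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def least_confidence_picks(model, X_pool, batch_size):
    """Rank pool items by the model's top predicted probability (low = uncertain)."""
    top_prob = model.predict_proba(X_pool).max(axis=1)
    return np.argsort(top_prob)[:batch_size]

# Synthetic stand-in for annotated essay features and labels.
X, y = make_classification(n_samples=1200, n_features=50, n_informative=10,
                           n_classes=3, random_state=0)
# A fixed holdout set gives a consistent basis for evaluation (cf. note 2).
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=0)

rng = np.random.default_rng(0)
labeled = list(rng.choice(len(X_train), size=50, replace=False))  # small seed set
pool = [i for i in range(len(X_train)) if i not in set(labeled)]
batch = len(X_train) // 10  # 10% increments (cf. note 1)

while pool:
    model = LogisticRegression(max_iter=1000).fit(X_train[labeled], y_train[labeled])
    print(f"{len(labeled):4d} labeled, holdout accuracy = "
          f"{model.score(X_test, y_test):.3f}")
    picks = least_confidence_picks(model, X_train[pool], min(batch, len(pool)))
    labeled.extend(pool[i] for i in picks)
    pool = [i for i in pool if i not in set(labeled)]
```

If the uncertainty criterion is doing useful work, the accuracy curve should climb faster than a random-selection baseline over the same annotation budget, which is the payoff active learning promises in this setting.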

The assessment project described in this article was funded, in part, by the Institute for Education Sciences, U.S. Department of Education (Grant R305G050091 and Grant R305F100007). The opinions expressed are those of the authors and do not represent views of the Institute or the U.S. Department of Education.


Notes

1. The percentages are all parameters to the model. These were selected because they allowed us to see the performance of the models at a reasonable granularity. It should be noted, however, that in our case, 10% of the total set represents over 100 additional essays. In real-world settings, a smaller increment would likely be used due to the cost of annotation.

2. We used a validation or holdout set to provide a consistent basis on which to judge the performance of the models.

3. For what it’s worth, these are analogous to the U.S. House of Representatives and Senate, respectively, with one giving more weight to more “populous” (i.e., frequent) entities, and the other giving “equal representation” to each entity (both schemes are illustrated in the first sketch following these notes).

4. Alternatively, we could have used the frequencies from the training set. We used frequencies from the remainder pool because they would be more accurate, especially at the earlier stages. In a real-life setting, where the items in the remainder pool would be unlabeled, those frequencies would, of course, be unknown (see the second sketch following these notes).
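One plausible reading of note 3’s analogy is the standard frequency-weighted versus macro averaging of per-class scores; that mapping is an assumption, and the paper’s actual evaluation measures are not reproduced here. A small, hypothetical illustration with scikit-learn’s f1_score:

```python
from sklearn.metrics import f1_score

# Toy, imbalanced predictions: class 1 is always misclassified as class 0.
y_true = [0] * 90 + [1] * 9 + [2] * 1
y_pred = [0] * 99 + [2] * 1

# "House": frequent classes dominate the average.
print(f1_score(y_true, y_pred, average='weighted', zero_division=0))  # ~0.86
# "Senate": every class counts equally, so class 1's failure shows through.
print(f1_score(y_true, y_pred, average='macro', zero_division=0))     # ~0.65
```

With the imbalance above, the weighted average stays near 0.86 because the dominant class is scored well, while the macro average drops to about 0.65 because the completely missed minority class counts as much as the others.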
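Likewise for note 4, here is a sketch of drawing a batch whose class proportions match the remainder pool’s label frequencies; those frequencies are knowable here only because the study’s pool was fully annotated in advance. All names are illustrative, not the authors’ code.

```python
# Stratified batch selection using remainder-pool label frequencies
# (hypothetical sketch; rounding may shift the batch size by one or two).
import numpy as np

def stratified_batch(pool_labels, batch_size, rng):
    """Sample pool indices so each class appears in proportion to its pool frequency."""
    pool_labels = np.asarray(pool_labels)
    chosen = []
    classes, counts = np.unique(pool_labels, return_counts=True)
    for cls, count in zip(classes, counts):
        n_cls = round(batch_size * count / len(pool_labels))
        idx = np.flatnonzero(pool_labels == cls)
        chosen.extend(rng.choice(idx, size=min(n_cls, len(idx)), replace=False))
    return np.array(chosen)

rng = np.random.default_rng(0)
pool = rng.choice([0, 1, 2], size=500, p=[0.7, 0.2, 0.1])  # imbalanced remainder pool
batch = stratified_batch(pool, 50, rng)
print(np.bincount(pool[batch]))  # roughly [35, 10, 5]
```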


Author information

Correspondence to Peter Hastings.


Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper


Cite this paper

Hastings, P., Hughes, S., Britt, M.A. (2018). Active Learning for Improving Machine Learning of Student Explanatory Essays. In: Penstein Rosé, C., et al. (eds.) Artificial Intelligence in Education. AIED 2018. Lecture Notes in Computer Science (LNAI), vol 10947. Springer, Cham. https://doi.org/10.1007/978-3-319-93843-1_11

  • DOI: https://doi.org/10.1007/978-3-319-93843-1_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-93842-4

  • Online ISBN: 978-3-319-93843-1

  • eBook Packages: Computer Science, Computer Science (R0)
