
Reducing the Cost: Cross-Prompt Pre-finetuning for Short Answer Scoring

  • Conference paper
Artificial Intelligence in Education (AIED 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13916)
Abstract

Automated Short Answer Scoring (SAS) is the task of automatically scoring a given response to a prompt based on rubrics and reference answers. Although SAS is useful in real-world applications, both rubrics and reference answers differ between prompts, so new data must be collected and a model trained for each new prompt. These requirements are costly, especially for schools and online courses where resources are limited and only a few prompts are used. In this work, we attempt to reduce this cost through a two-phase approach: first pre-finetune a model on existing rubrics and answers with gold score signals, then finetune it on a new prompt. Specifically, since scoring rubrics and reference answers differ for each prompt, we utilize key phrases, i.e., representative expressions that an answer should contain to earn a higher score, and train a SAS model to learn the relationship between key phrases and answers using already annotated prompts (i.e., cross-prompt data). Our experimental results show that pre-finetuning on existing cross-prompt data with key phrases significantly improves scoring accuracy, especially when the training data for the new prompt is limited. Finally, our extensive analysis shows that it is crucial to design the model so that it can learn the task's general properties. We publicly release our code and all of the experimental settings for reproducing our results (https://github.com/hiro819/Reducing-the-cost-cross-prompt-prefinetuning-for-SAS).
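
As a rough illustration of the two-phase recipe above, the sketch below pairs each answer with its prompt's key phrases as input to a BERT-based regressor, pre-finetunes the regressor on already-annotated cross-prompt data, and then finetunes the same weights on the new prompt. This is a minimal sketch, not the authors' released implementation: the "key phrases [SEP] answer" input format, the regression head, the model name (borrowed from the cl-tohoku/bert-japanese repository mentioned in the notes), and all hyperparameters are illustrative assumptions.

import torch
from torch import nn
from transformers import AutoTokenizer, AutoModel


class KeyPhraseScorer(nn.Module):
    """BERT encoder with a linear head that predicts a normalized score."""

    def __init__(self, model_name="cl-tohoku/bert-base-japanese-v2"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        self.head = nn.Linear(self.encoder.config.hidden_size, 1)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]        # [CLS] representation
        return self.head(cls).squeeze(-1)        # one scalar score per answer


def encode_batch(tokenizer, key_phrases, answers):
    # Pair the prompt's key phrases with each answer so the model can learn
    # the relationship between them ("key phrases [SEP] answer").
    return tokenizer(key_phrases, answers, truncation=True, padding=True,
                     return_tensors="pt")


def run_epoch(model, tokenizer, data, optimizer):
    # `data` is a list of (key_phrases, answer, normalized_gold_score) triples.
    model.train()
    for key_phrases, answer, gold in data:
        batch = encode_batch(tokenizer, [key_phrases], [answer])
        pred = model(batch["input_ids"], batch["attention_mask"])
        loss = nn.functional.mse_loss(pred, torch.tensor([float(gold)]))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()


# Phase 1: pre-finetune on already-annotated cross-prompt data (many prompts,
#          each with its own key phrases).
# Phase 2: continue training the same weights on the new prompt's small dataset.
# model = KeyPhraseScorer()
# tokenizer = AutoTokenizer.from_pretrained("cl-tohoku/bert-base-japanese-v2")
# optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
# run_epoch(model, tokenizer, cross_prompt_data, optimizer)   # phase 1
# run_epoch(model, tokenizer, new_prompt_data, optimizer)     # phase 2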

Notes

  1. https://github.com/hiro819/Reducing-the-cost-cross-prompt-prefinetuning-for-SAS.

  2. https://aip-nlu.gitlab.io/resources/sas-japanese.

  3. A type of question in which the student reads an essay and answers prompts about its content.

  4. We used pretrained Japanese BERT models from https://github.com/cl-tohoku/bert-japanese.

Acknowledgments

We are grateful to Dr. Paul Reisert for writing and editing assistance. This work was supported by JSPS KAKENHI Grant Numbers 22H00524 and JP19K12112, and by JST SPRING Grant Number JPMJSP2114. We also thank Takamiya Gakuen Yoyogi Seminar for providing invaluable data for our experiments, and the anonymous reviewers for their insightful comments.

Author information

Corresponding author

Correspondence to Hiroaki Funayama.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Funayama, H., Asazuma, Y., Matsubayashi, Y., Mizumoto, T., Inui, K. (2023). Reducing the Cost: Cross-Prompt Pre-finetuning for Short Answer Scoring. In: Wang, N., Rebolledo-Mendez, G., Matsuda, N., Santos, O.C., Dimitrova, V. (eds) Artificial Intelligence in Education. AIED 2023. Lecture Notes in Computer Science (LNAI), vol 13916. Springer, Cham. https://doi.org/10.1007/978-3-031-36272-9_7

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-36272-9_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-36271-2

  • Online ISBN: 978-3-031-36272-9

  • eBook Packages: Computer Science, Computer Science (R0)
