Abstract
This paper presents the motivation, design, and two experiments for automatic scoring of handwritten descriptive answers. The first experiment scores handwritten short descriptive answers in Japanese language exams. We used a deep neural network (DNN)-based handwriting recognizer and a transformer-based automatic scorer, without correcting misrecognized characters or adding rubric annotations for scoring. We achieved acceptable agreement between automatic and human scoring while using only 1.7% of the human-scored answers for training. The second experiment scores descriptive answers written on electronic paper for Japanese, English, and math drills. We used DNN-based online and offline handwriting recognizers for each subject and applied simple exact matching of the recognized candidates against the correct answers. The experiment shows that combining the online and offline recognizers reduces the false negative rate and that rejecting answers with low recognition scores reduces the false positive rate. Even with the current system, human scorers need to manually score less than 30% of the answers, with false positive (risky) scores of about 2% or less for the three subjects.
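The decision logic of the second experiment can be illustrated with a minimal sketch. It is not the implementation used in the paper: the `Candidate` type, the `score_answer` function, the three outcome labels, the pooling of the two candidate lists, and the threshold value of 0.8 are all assumptions introduced here for illustration. The sketch only shows how exact matching, recognizer combination, and rejection of low recognition scores might fit together.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    text: str     # recognized string
    score: float  # recognition confidence, assumed to lie in [0, 1]


def score_answer(online: List[Candidate],
                 offline: List[Candidate],
                 correct: str,
                 threshold: float = 0.8) -> str:
    """Return 'correct', 'incorrect', or 'manual' (send to a human scorer).

    Assumptions (not taken from the paper): candidates from both
    recognizers are pooled, and one threshold governs rejection.
    """
    # Combine the recognizers: a match from either one counts, which
    # is intended to lower the false negative rate.
    candidates = list(online) + list(offline)
    if not candidates:
        return "manual"

    # Exact matching of recognized candidates against the correct answer.
    matches = [c for c in candidates if c.text == correct]
    if matches:
        best = max(c.score for c in matches)
        # Reject low-score matches to avoid risky false positives.
        return "correct" if best >= threshold else "manual"

    # No match: trust the negative result only if recognition is confident.
    top = max(c.score for c in candidates)
    return "incorrect" if top >= threshold else "manual"


if __name__ == "__main__":
    online = [Candidate("17", 0.90)]
    offline = [Candidate("12", 0.85)]
    # The offline recognizer supplies the match the online one missed.
    print(score_answer(online, offline, correct="12"))
```

Under these assumptions, only answers labeled `manual` would need human scoring, and raising `threshold` trades a larger manual workload for fewer false positives.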
C. T. Nguyen—Work done while at Tokyo University of Agriculture and Technology.
H. Oka—Work done while at The University of Tokyo.
Acknowledgement
This work is partially supported by the joint research budget from WACOM Co., Ltd. and by KAKENHI grants JP24H00738, JP23H03511, JP22H00085, and JP21K18136.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Nakagawa, M. et al. (2024). Two Experiments for Automatic Scoring of Handwritten Descriptive Answers. In: Sfikas, G., Retsinas, G. (eds) Document Analysis Systems. DAS 2024. Lecture Notes in Computer Science, vol 14994. Springer, Cham. https://doi.org/10.1007/978-3-031-70442-0_1
DOI: https://doi.org/10.1007/978-3-031-70442-0_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70441-3
Online ISBN: 978-3-031-70442-0
eBook Packages: Computer Science, Computer Science (R0)