Two Experiments for Automatic Scoring of Handwritten Descriptive Answers

Nakagawa, Masaki; Nguyen, Hung Tuan; Truong, Nghia Thanh; Ly, Nam Tuan; Nguyen, Cuong Tuan; Oka, Haruki; Ishioka, Tsunenori; Asakura, Tomo; Miyazawa, Hiroshi; Yamamoto, Takahiro; Horie, Toshihiko; Yasuno, Fumiko

doi:10.1007/978-3-031-70442-0_1

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14994)

Included in the following conference series:

International Workshop on Document Analysis Systems

Abstract

This paper presents our motivation, design, and two experiments for automatic scoring of handwritten descriptive answers. The first experiment scores handwritten short descriptive answers in Japanese language exams. We used a deep neural network (DNN)-based handwriting recognizer and a transformer-based automatic scorer, without correcting misrecognized characters or adding rubric annotations for scoring. We achieved acceptable agreement between the automatic scoring and the human scoring while using only 1.7% of the human-scored answers for training. The second experiment scores descriptive answers written on electronic paper for Japanese, English, and math drills. We used DNN-based online and offline handwriting recognizers for each subject and applied simple exact matching between the recognized candidates and the correct answers. The experiment shows that combining the online and offline recognizers reduces the false negative rate and that rejecting low recognition scores reduces the false positive rate. Even with the current system, human scorers need to manually score less than 30% of the answers, with false positive (risky) scores of about 2% or less for the three subjects.
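The decision logic of the second experiment can be illustrated with a minimal sketch. It is not the authors' implementation: the `Candidate` type, the `triage` function, the score scale of 0 to 1, the threshold value, and the two outcome labels are all assumptions made for illustration. The sketch pools candidates from an online and an offline recognizer, accepts an answer automatically only when a sufficiently confident candidate exactly matches a correct answer, and defers everything else to a human scorer.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Candidate:
    """One recognition candidate (hypothetical structure)."""
    text: str
    score: float  # recognition confidence, assumed to lie in [0, 1]


def triage(
    correct_answers: Iterable[str],
    online: Iterable[Candidate],
    offline: Iterable[Candidate],
    threshold: float = 0.8,  # assumed value, not taken from the paper
) -> str:
    """Return "auto_correct" or "needs_review" for one handwritten answer."""
    accepted = {answer.strip() for answer in correct_answers}
    # Pooling both recognizers gives a correct answer more chances to match,
    # which is the intuition behind a lower false negative rate.
    for cand in list(online) + list(offline):
        # Ignoring low-score candidates avoids accepting an answer on the
        # strength of an unreliable recognition result (fewer false positives).
        if cand.score >= threshold and cand.text.strip() in accepted:
            return "auto_correct"
    return "needs_review"
```

For example, `triage(["42"], [Candidate("47", 0.9)], [Candidate("42", 0.9)])` accepts the answer because the offline candidate matches, whereas a matching candidate with a score of 0.3 would be sent for manual review. Raising the threshold trades more manual scoring for fewer risky automatic acceptances.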

C. T. Nguyen—Work done while at Tokyo University of Agriculture and Technology.

H. Oka—Work done while at The University of Tokyo.

Acknowledgement

This work is partially supported by the joint research budget from Wacom Co., Ltd. and by KAKENHI grants JP24H00738, JP23H03511, JP22H00085, and JP21K18136.

Author information

Authors and Affiliations

Tokyo University of Agriculture and Technology, Tokyo, Japan
Masaki Nakagawa, Hung Tuan Nguyen, Nghia Thanh Truong, Nam Tuan Ly & Tomo Asakura
Vietnamese-German University, Ho Chi Minh City, Vietnam
Cuong Tuan Nguyen
Recruit Co. Ltd., Tokyo, Japan
Haruki Oka
The National Center for University Entrance Examinations, Tokyo, Japan
Tsunenori Ishioka
Wacom Co., Ltd., Saitama, Japan
Tomo Asakura, Hiroshi Miyazawa, Takahiro Yamamoto & Toshihiko Horie
National Institute for Educational Policy Research, 3-2-2 Kasumigaseki, Tokyo, Japan
Fumiko Yasuno

Corresponding author

Correspondence to Masaki Nakagawa.

Editor information

Editors and Affiliations

University of West Attica, Egaleo, Greece
Giorgos Sfikas
National Technical University of Athens, Zografou, Greece
George Retsinas

Cite this paper

Nakagawa, M. et al. (2024). Two Experiments for Automatic Scoring of Handwritten Descriptive Answers. In: Sfikas, G., Retsinas, G. (eds) Document Analysis Systems. DAS 2024. Lecture Notes in Computer Science, vol 14994. Springer, Cham. https://doi.org/10.1007/978-3-031-70442-0_1

DOI: https://doi.org/10.1007/978-3-031-70442-0_1
Published: 11 September 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70441-3
Online ISBN: 978-3-031-70442-0
eBook Packages: Computer Science (R0)