ABSTRACT
We have developed an automated Japanese short-answer scoring and support system for the new National Center written test exams. Our approach rests on the fact that accurate recognition of textual entailment and/or synonymy has remained almost impossible for several years: the system generates automated scores on the basis of evaluation criteria, or rubrics, and human raters then revise them. The system determines the semantic similarity between the model answers and the actual written answers, as well as a certain degree of semantic identity and implication. Because the scoring results must be classified at multiple levels, we use random forests, which exploit many predictors effectively, rather than support vector machines. An experimental prototype operates as a web system on a Linux computer. We compared human scores with the automated scores for a case in which 3--6 allotment points were assigned in 8 categories of a social studies test given as a trial examination. The differences between the two sets of scores were within one point for 70--90 percent of the data when high-level semantic judgment was not needed.
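To make the scoring pipeline concrete, here is a minimal sketch, assuming scikit-learn and toy surface-similarity features. The feature functions, model answers, and training data below are illustrative assumptions, not the authors' actual predictors, which involve deeper judgments of semantic identity and implication.

```python
# Minimal sketch (not the authors' implementation): multi-level scoring of
# short answers with a random forest over simple similarity features.
from difflib import SequenceMatcher
from sklearn.ensemble import RandomForestClassifier

def features(answer: str, model_answers: list[str]) -> list[float]:
    """Toy predictors: character-level similarity and length ratio
    of a written answer against each model answer."""
    sims = [SequenceMatcher(None, answer, m).ratio() for m in model_answers]
    lens = [len(answer) / max(len(m), 1) for m in model_answers]
    return [max(sims), sum(sims) / len(sims), max(lens)]

# Hypothetical rubric: model answers plus human-graded training answers
# with scores on a 0..3 scale.
model_answers = ["参勤交代により大名の財力が削がれた",
                 "大名は参勤交代で経済的負担を負った"]
train = [("参勤交代で大名の財力が弱まった", 3),
         ("大名が江戸に通った", 1),
         ("幕府が開かれた", 0)]

X = [features(a, model_answers) for a, _ in train]
y = [s for _, s in train]
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Automated score for a new answer; a human rater may overwrite it.
auto = clf.predict([features("参勤交代は大名の経済力を削いだ", model_answers)])[0]

def within_one_point(auto_scores, human_scores):
    """Agreement measure in the paper's style: the share of answers whose
    automated score differs from the human score by at most one point."""
    pairs = list(zip(auto_scores, human_scores))
    return sum(abs(a - h) <= 1 for a, h in pairs) / len(pairs)
```

A random forest fits this setting because the score is an ordinal, multi-level label and the system draws on many heterogeneous predictors; a single support vector machine would need a one-vs-rest or similar decomposition and careful kernel tuning to do the same.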