Abstract
Automatic short answer grading for Intelligent Tutoring Systems has attracted considerable research attention over the years. While traditional techniques for short answer grading are rooted in statistical learning and hand-crafted features, recent research has explored techniques based on sentence embeddings. We observe that sentence embedding techniques, while effective for grading in-domain student answers, may not be best suited to out-of-domain answers. Further, sentence embeddings can be affected by non-sentential answers (answers given in the context of the question). Token-level hand-crafted features, on the other hand, can be fairly domain independent and are less affected by non-sentential forms. We propose a novel feature encoding based on partial similarities of tokens (Histogram of Partial Similarities, or HoPS), its extension to part-of-speech tags (HoPSTags), and question type information. Combining the proposed features with sentence-embedding-based features further improves grading performance. Our final model achieves better or competitive results in experimental evaluation on multiple benchmark datasets and a large-scale industry dataset.
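The abstract names HoPS but does not say how it is computed. The following is a minimal sketch of one plausible reading, not the authors' implementation: for each token of a reference answer, take its best similarity to any student-answer token and bin those scores into a normalized histogram. The function names, the bin count, whitespace tokenization, and the character-based similarity (a stand-in for whatever token similarity the paper uses) are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from typing import List


def token_similarity(a: str, b: str) -> float:
    """Hypothetical stand-in similarity in [0, 1] (character overlap).

    The abstract does not specify the token similarity measure.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def hops(reference: str, student: str, n_bins: int = 5) -> List[float]:
    """Histogram of each reference token's best match in the student answer.

    Assumed design: whitespace tokenization, equal-width bins over [0, 1],
    counts normalized by the number of reference tokens.
    """
    ref_tokens = reference.split()
    stu_tokens = student.split()
    hist = [0.0] * n_bins
    if not ref_tokens:
        return hist
    for ref_tok in ref_tokens:
        # Best partial similarity of this reference token to any student token;
        # 0.0 if the student answer is empty.
        best = max((token_similarity(ref_tok, s) for s in stu_tokens), default=0.0)
        # Map a score of exactly 1.0 into the last bin.
        idx = min(int(best * n_bins), n_bins - 1)
        hist[idx] += 1.0
    return [count / len(ref_tokens) for count in hist]


if __name__ == "__main__":
    print(hops("the battery is connected to the bulb", "battery connects bulb"))
```

Under these assumptions the feature vector has a fixed length regardless of answer length, which is what would let it sit alongside a sentence embedding as input to a single classifier.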
Notes
All experiments on Sultan et al.'s system were performed using their publicly available code at https://github.com/ma-sultan/short-answer-grader.
References
Agirre, E., Banea, C., Cardie, C., Cer, D., Diab, M., Gonzalez-Agirre, A., Guo, W., Lopez-Gazpio, I., Maritxalar, M., Mihalcea, R., et al.: Semeval-2015 task 2: semantic textual similarity, English, Spanish and pilot on interpretability. In: Proceedings of the International Workshop on Semantic Evaluation, pp. 252–263 (2015)
Agirre, E., Cer, D., Diab, M., Gonzalez-Agirre, A., Guo, W.: *SEM 2013 shared task: semantic textual similarity. In: Proceedings of the Joint Conference on Lexical and Computational Semantics, vol. 1, pp. 32–43 (2013)
Agirre, E., Diab, M., Cer, D., Gonzalez-Agirre, A.: Semeval-2012 task 6: a pilot on semantic textual similarity. In: Proceedings of the Joint Conference on Lexical and Computational Semantics, pp. 385–393 (2012)
Alikaniotis, D., Yannakoudakis, H., Rei, M.: Automatic text scoring using neural networks. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics, vol. 1, pp. 715–725 (2016)
Bjerva, J., Bos, J., Van der Goot, R., Nissim, M.: The meaning factory: formal semantics for recognizing textual entailment and determining semantic similarity. In: Proceedings of the International Workshop on Semantic Evaluation, pp. 642–646 (2014)
Bowman, S.R., Angeli, G., Potts, C., Manning, C.D.: A large annotated corpus for learning natural language inference. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (2015)
Conneau, A., Kiela, D., Schwenk, H., Barrault, L., Bordes, A.: Supervised learning of universal sentence representations from natural language inference data. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 670–680 (2017)
Dzikovska, M.O., Nielsen, R.D., Brew, C., Leacock, C., Giampiccolo, D., Bentivogli, L., Clark, P., Dagan, I., Dang, H.T.: SemEval-2013 Task 7: the joint student response analysis and 8th recognizing textual entailment challenge. In: Proceedings of the NAACL-HLT International Workshop on Semantic Evaluation, pp. 263–274 (2013)
Heilman, M., Madnani, N.: ETS: domain adaptation and stacking for short answer scoring. In: Proceedings of the Joint Conference on Lexical and Computational Semantics, vol. 2, pp. 275–279 (2013)
Jimenez, S., Becerra, C., Gelbukh, A.: SOFTCARDINALITY: hierarchical text overlap for student response analysis. In: Proceedings of the Joint Conference on Lexical and Computational Semantics, vol. 2, pp. 280–284 (2013)
Kumar, S., Chakrabarti, S., Roy, S.: Earth mover's distance pooling over Siamese LSTMs for automatic short answer grading. In: Proceedings of the International Joint Conference on Artificial Intelligence, pp. 2046–2052 (2017)
Levy, O., Zesch, T., Dagan, I., Gurevych, I.: Recognizing partial textual entailment. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics, vol. 2, pp. 451–455 (2013)
Lopez-Gazpio, I., Maritxalar, M., Gonzalez-Agirre, A., Rigau, G., Uria, L., Agirre, E.: Interpretable semantic textual similarity: finding and explaining differences between sentences. Knowl. Based Syst. 119, 186–199 (2017)
Michalenko, J.J., Lan, A.S., Baraniuk, R.G.: D.TRUMP: data-mining textual responses to uncover misconception patterns. In: Proceedings of the Fourth ACM Conference on Learning @ Scale (L@S), pp. 245–248 (2017)
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Proceedings of the Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)
Miller, G.A.: WordNet: a lexical database for English. Commun. ACM 38(11), 39–41 (1995)
Mitchell, T., Russell, T., Broomhead, P., Aldridge, N.: Towards robust computerised marking of free-text responses. In: Proceedings of the International Computer Assisted Assessment Conference (2002)
Mohler, M., Bunescu, R.C., Mihalcea, R.: Learning to grade short answer questions using semantic similarity measures and dependency graph alignments. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp. 752–762 (2011)
Morgan, J.: Sentence fragments and the notion 'sentence'. In: Issues in Linguistics: Papers in Honor of Henry and Renée Kahane, pp. 719–751 (1973)
Mueller, J., Thyagarajan, A.: Siamese recurrent architectures for learning sentence similarity. In: Proceedings of the Association for the Advancement of Artificial Intelligence, pp. 2786–2792 (2016)
Nielsen, R.D., Ward, W., Martin, J.H.: Recognizing entailment in intelligent tutoring systems. Nat. Lang. Eng. 15(4), 479–501 (2009)
Ott, N., Ziai, R., Hahn, M., Meurers, D.: CoMeT: integrating different levels of linguistic modeling for meaning assessment. In: Proceedings of the Joint Conference on Lexical and Computational Semantics, vol. 2, pp. 608–616 (2013)
Ramachandran, L., Cheng, J., Foltz, P.: Identifying patterns for short answer scoring using graph-based lexico-semantic text matching. In: Proceedings of the Workshop on Innovative Use of NLP for Building Educational Applications, pp. 97–106 (2015)
Rus, V., Lintean, M., Banjade, R., Niraula, N., Stefanescu, D.: SEMILAR: the semantic similarity toolkit. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 163–168 (2013)
Sukkarieh, J.Z., Pulman, S.G., Raikes, N.: Auto-marking 2: an update on the UCLES-Oxford University research into using computational linguistics to score short, free text responses. Int. Assoc. Educ. Assess. (2004)
Sultan, M.A., Salazar, C., Sumner, T.: Fast and easy short answer grading with high accuracy. In: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1070–1075 (2016)
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Saha, S., Dhamecha, T.I., Marvaniya, S., Sindhgatta, R., Sengupta, B. (2018). Sentence Level or Token Level Features for Automatic Short Answer Grading?: Use Both. In: Penstein Rosé, C., et al. Artificial Intelligence in Education. AIED 2018. Lecture Notes in Computer Science, vol 10947. Springer, Cham. https://doi.org/10.1007/978-3-319-93843-1_37
DOI: https://doi.org/10.1007/978-3-319-93843-1_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93842-4
Online ISBN: 978-3-319-93843-1
eBook Packages: Computer Science (R0)