Sentence Level or Token Level Features for Automatic Short Answer Grading?: Use Both

Saha, Swarnadeep; Dhamecha, Tejas I.; Marvaniya, Smit; Sindhgatta, Renuka; Sengupta, Bikram

doi:10.1007/978-3-319-93843-1_37

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10947)

Included in the following conference series:

International Conference on Artificial Intelligence in Education

Abstract

Automatic short answer grading for Intelligent Tutoring Systems has attracted considerable attention from researchers over the years. While traditional techniques for short answer grading are rooted in statistical learning and hand-crafted features, recent research has explored techniques based on sentence embeddings. We observe that sentence embedding techniques, while effective for grading in-domain student answers, may not be best suited for out-of-domain answers. Further, sentence embeddings can be affected by non-sentential answers (answers given in the context of the question). On the other hand, token-level hand-crafted features can be fairly domain independent and are less affected by non-sentential forms. We propose a novel feature encoding based on partial similarities of tokens (Histogram of Partial Similarities, or HoPS), its extension to part-of-speech tags (HoPSTags), and question type information. Combining the proposed features with sentence-embedding-based features further improves grading performance. Our final model achieves better or competitive results in experimental evaluation on multiple benchmarking datasets and a large-scale industry dataset.
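
To make the idea of a histogram over partial token similarities concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not specify the similarity measure, the number of bins, or the tokenization, so the character-overlap similarity from Python's difflib, the five equal-width bins, the whitespace tokenizer, and the names token_similarity and hops are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher


def token_similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1] based on character overlap.

    Assumption: the paper's actual token similarity is not given in the
    abstract; this is only a placeholder.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def hops(reference: str, answer: str, bins: int = 5) -> list[float]:
    """Normalized histogram of each reference token's best match in the answer."""
    ref_tokens = reference.split()  # assumed whitespace tokenization
    ans_tokens = answer.split()
    counts = [0] * bins
    for ref_tok in ref_tokens:
        # Best partial match of this reference token against the answer tokens.
        best = max((token_similarity(ref_tok, t) for t in ans_tokens), default=0.0)
        # Map [0, 1] onto bin indices 0..bins-1; a perfect match lands in the last bin.
        idx = min(int(best * bins), bins - 1)
        counts[idx] += 1
    total = len(ref_tokens)
    return [c / total for c in counts] if total else [0.0] * bins


# Hypothetical reference answer and student answer.
features = hops("the battery is in a closed path", "battery in closed path")
```

The result is a fixed-length vector regardless of answer length, so in principle it could be concatenated with sentence-embedding features before classification; how the paper actually combines the two is not stated in the abstract.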

Notes

1. https://github.com/facebookresearch/InferSent.
2. http://www.nltk.org/.
3. http://scikit-learn.org/stable/.
4. All experiments on Sultan et al.'s system were performed using their publicly available code at https://github.com/ma-sultan/short-answer-grader.
5. http://www.semanticsimilarity.org/.
6. https://docs.google.com/spreadsheets/d/1Xe3lCi9jnZQiZW97hBfkg0x4cI3oDfztZPhK3TGO_gw/pub?output=html.

References

Agirre, E., Banea, C., Cardie, C., Cer, D., Diab, M., Gonzalez-Agirre, A., Guo, W., Lopez-Gazpio, I., Maritxalar, M., Mihalcea, R., et al.: SemEval-2015 task 2: semantic textual similarity, English, Spanish and pilot on interpretability. In: Proceedings of the International Workshop on Semantic Evaluation, pp. 252–263 (2015)
Agirre, E., Cer, D., Diab, M., Gonzalez-Agirre, A., Guo, W.: *SEM 2013 shared task: semantic textual similarity. In: Proceedings of the Joint Conference on Lexical and Computational Semantics, vol. 1, pp. 32–43 (2013)
Agirre, E., Diab, M., Cer, D., Gonzalez-Agirre, A.: SemEval-2012 task 6: a pilot on semantic textual similarity. In: Proceedings of the Joint Conference on Lexical and Computational Semantics, pp. 385–393 (2012)
Alikaniotis, D., Yannakoudakis, H., Rei, M.: Automatic text scoring using neural networks. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics, vol. 1, pp. 715–725 (2016)
Bjerva, J., Bos, J., Van der Goot, R., Nissim, M.: The meaning factory: formal semantics for recognizing textual entailment and determining semantic similarity. In: Proceedings of the International Workshop on Semantic Evaluation, pp. 642–646 (2014)
Bowman, S.R., Angeli, G., Potts, C., Manning, C.D.: A large annotated corpus for learning natural language inference. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (2015)
Conneau, A., Kiela, D., Schwenk, H., Barrault, L., Bordes, A.: Supervised learning of universal sentence representations from natural language inference data. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 670–680 (2017)
Dzikovska, M.O., Nielsen, R.D., Brew, C., Leacock, C., Giampiccolo, D., Bentivogli, L., Clark, P., Dagan, I., Dang, H.T.: SemEval-2013 Task 7: the joint student response analysis and 8th recognizing textual entailment challenge. In: Proceedings of the NAACL-HLT International Workshop on Semantic Evaluation, pp. 263–274 (2013)
Heilman, M., Madnani, N.: ETS: domain adaptation and stacking for short answer scoring. In: Proceedings of the Joint Conference on Lexical and Computational Semantics, vol. 2, pp. 275–279 (2013)
Jimenez, S., Becerra, C., Gelbukh, A.: SOFTCARDINALITY: hierarchical text overlap for student response analysis. In: Proceedings of the Joint Conference on Lexical and Computational Semantics, vol. 2, pp. 280–284 (2013)
Kumar, S., Chakrabarti, S., Roy, S.: Earth mover's distance pooling over Siamese LSTMs for automatic short answer grading. In: Proceedings of the International Joint Conference on Artificial Intelligence, pp. 2046–2052 (2017)
Levy, O., Zesch, T., Dagan, I., Gurevych, I.: Recognizing partial textual entailment. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics, vol. 2, pp. 451–455 (2013)
Lopez-Gazpio, I., Maritxalar, M., Gonzalez-Agirre, A., Rigau, G., Uria, L., Agirre, E.: Interpretable semantic textual similarity: finding and explaining differences between sentences. Knowl. Based Syst. 119, 186–199 (2017)
Michalenko, J.J., Lan, A.S., Baraniuk, R.G.: D.TRUMP: data-mining textual responses to uncover misconception patterns. In: Proceedings of the Fourth ACM Conference on Learning @ Scale (L@S), pp. 245–248 (2017)
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Proceedings of the Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)
Miller, G.A.: WordNet: a lexical database for English. Commun. ACM 38(11), 39–41 (1995)
Mitchell, T., Russell, T., Broomhead, P., Aldridge, N.: Towards robust computerised marking of free-text responses. In: Proceedings of the International Computer Assisted Assessment Conference (2002)
Mohler, M., Bunescu, R.C., Mihalcea, R.: Learning to grade short answer questions using semantic similarity measures and dependency graph alignments. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp. 752–762 (2011)
Morgan, J.: Sentence fragments and the notion sentence. Issues in Linguistics: Papers in Honor of Henry and Renée Kahane, pp. 719–751 (1973)
Mueller, J., Thyagarajan, A.: Siamese recurrent architectures for learning sentence similarity. In: Proceedings of the Association for the Advancement of Artificial Intelligence, pp. 2786–2792 (2016)
Nielsen, R.D., Ward, W., Martin, J.H.: Recognizing entailment in intelligent tutoring systems. Nat. Lang. Eng. 15(4), 479–501 (2009)
Ott, N., Ziai, R., Hahn, M., Meurers, D.: CoMeT: integrating different levels of linguistic modeling for meaning assessment. In: Proceedings of the Joint Conference on Lexical and Computational Semantics, vol. 2, pp. 608–616 (2013)
Ramachandran, L., Cheng, J., Foltz, P.: Identifying patterns for short answer scoring using graph-based lexico-semantic text matching. In: Proceedings of the Workshop on Innovative Use of NLP for Building Educational Applications, pp. 97–106 (2015)
Rus, V., Lintean, M., Banjade, R., Niraula, N., Stefanescu, D.: Semilar: the semantic similarity toolkit. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 163–168 (2013)
Sukkarieh, J.Z., Pulman, S.G., Raikes, N.: Auto-marking 2: an update on the UCLES-Oxford University research into using computational linguistics to score short, free text responses. Int. Assoc. Educ. Assess. (2004)
Sultan, M.A., Salazar, C., Sumner, T.: Fast and easy short answer grading with high accuracy. In: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1070–1075 (2016)

Author information

Authors and Affiliations

IBM Research, Bangalore, India
Swarnadeep Saha, Tejas I. Dhamecha, Smit Marvaniya, Renuka Sindhgatta & Bikram Sengupta

Corresponding author

Correspondence to Swarnadeep Saha.

Editor information

Editors and Affiliations

Carnegie Mellon University, Pittsburgh, PA, USA
Carolyn Penstein Rosé
University of Technology, Sydney, NSW, Australia
Roberto Martínez-Maldonado
University of Duisburg-Essen, Duisburg, Germany
H. Ulrich Hoppe
UCL Institute of Education, London, UK
Rose Luckin
UCL Institute of Education, London, UK
Manolis Mavrikis
UCL Institute of Education, London, UK
Kaska Porayska-Pomsta
Carnegie Mellon University, Pittsburgh, PA, USA
Bruce McLaren
University of Sussex, Brighton, UK
Benedict du Boulay

About this paper

Cite this paper

Saha, S., Dhamecha, T.I., Marvaniya, S., Sindhgatta, R., Sengupta, B. (2018). Sentence Level or Token Level Features for Automatic Short Answer Grading?: Use Both. In: Penstein Rosé, C., et al. Artificial Intelligence in Education. AIED 2018. Lecture Notes in Computer Science, vol 10947. Springer, Cham. https://doi.org/10.1007/978-3-319-93843-1_37

DOI: https://doi.org/10.1007/978-3-319-93843-1_37
Published: 20 June 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93842-4
Online ISBN: 978-3-319-93843-1
eBook Packages: Computer Science, Computer Science (R0)
