Sentence Level or Token Level Features for Automatic Short Answer Grading?: Use Both

  • Conference paper
Artificial Intelligence in Education (AIED 2018)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10947)

Abstract

Automatic short answer grading for Intelligent Tutoring Systems has attracted considerable attention from researchers over the years. While traditional techniques for short answer grading are rooted in statistical learning and hand-crafted features, recent research has explored techniques based on sentence embeddings. We observe that sentence embedding techniques, while effective for grading in-domain student answers, may not be best suited for out-of-domain answers. Further, sentence embeddings can be affected by non-sentential answers (answers given in the context of the question). On the other hand, token-level hand-crafted features can be fairly domain independent and are less affected by non-sentential forms. We propose a novel feature encoding based on partial similarities of tokens (Histogram of Partial Similarities, or HoPS), its extension to part-of-speech tags (HoPSTags), and question type information. Combining the proposed features with sentence-embedding-based features further improves grading performance. Our final model achieves better or competitive results in experimental evaluation on multiple benchmark datasets and a large-scale industry dataset.
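As an illustration of the token-level idea, the sketch below shows one way a histogram of partial similarities could be computed. The abstract does not specify the similarity measure, the binning, or the matching direction, so everything here is an assumption rather than the authors' implementation: the function names `hops` and `char_bigram_jaccard`, the five equal-width bins, the choice to match each reference-answer token to its most similar student-answer token, and the toy character-bigram similarity, which stands in for whatever token similarity the paper actually uses.

```python
from typing import Callable, List


def char_bigram_jaccard(a: str, b: str) -> float:
    """Toy token similarity in [0, 1]: Jaccard overlap of character bigrams.

    Hypothetical stand-in; the paper's similarity measure is not given here.
    """
    def grams(s: str) -> set:
        s = s.lower()
        # Fall back to the whole token when it is too short to form a bigram.
        return {s[i:i + 2] for i in range(len(s) - 1)} or {s}

    ga, gb = grams(a), grams(b)
    return len(ga & gb) / len(ga | gb)


def hops(
    reference: List[str],
    student: List[str],
    sim: Callable[[str, str], float] = char_bigram_jaccard,
    n_bins: int = 5,
) -> List[float]:
    """Assumed HoPS-style feature: a normalised histogram over the best
    similarity that each reference token achieves against the student answer.
    """
    if not reference:
        return [0.0] * n_bins
    counts = [0] * n_bins
    for ref_token in reference:
        # Best partial match for this reference token (0.0 if no student tokens).
        best = max((sim(ref_token, s) for s in student), default=0.0)
        # Map [0, 1] onto n_bins equal-width bins; a score of 1.0 goes in the last bin.
        counts[min(int(best * n_bins), n_bins - 1)] += 1
    return [c / len(reference) for c in counts]


if __name__ == "__main__":
    ref = "the battery supplies voltage to the bulb".split()
    ans = "batteries supply voltage".split()
    print(hops(ref, ans))
```

Under these assumptions the feature vector has a fixed length regardless of answer length, and mass in the upper bins indicates that most reference tokens found a close match in the student answer. Such a vector could then be concatenated with a sentence embedding before classification, which is the combination the abstract argues for.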

Notes

  1. https://github.com/facebookresearch/InferSent.

  2. http://www.nltk.org/.

  3. http://scikit-learn.org/stable/.

  4. All experiments on Sultan et al.'s system were performed using their publicly available code at https://github.com/ma-sultan/short-answer-grader.

  5. http://www.semanticsimilarity.org/.

  6. https://docs.google.com/spreadsheets/d/1Xe3lCi9jnZQiZW97hBfkg0x4cI3oDfztZPhK3TGO_gw/pub?output=html.

Author information

Corresponding author

Correspondence to Swarnadeep Saha.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Saha, S., Dhamecha, T.I., Marvaniya, S., Sindhgatta, R., Sengupta, B. (2018). Sentence Level or Token Level Features for Automatic Short Answer Grading?: Use Both. In: Penstein Rosé, C., et al. Artificial Intelligence in Education. AIED 2018. Lecture Notes in Computer Science, vol 10947. Springer, Cham. https://doi.org/10.1007/978-3-319-93843-1_37

  • DOI: https://doi.org/10.1007/978-3-319-93843-1_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-93842-4

  • Online ISBN: 978-3-319-93843-1

  • eBook Packages: Computer Science (R0)
