Automated Grading of Short Text Answers: Preliminary Results in a Course of Health Informatics

  • Conference paper
  • In: Advances in Web-Based Learning – ICWL 2019 (ICWL 2019)

Abstract

Students learning Health Informatics in the degree course of Medicine and Surgery at the University of L’Aquila (Italy) are required, in order to pass the exam, to submit solutions to assignments concerning the execution and interpretation of statistical analyses. The paper presents a tool for the automated grading of such solutions, where the statistical analyses are made up of R commands and their outputs, and the interpretations are short text answers. The tool performs a static analysis of the R commands and their output, and uses Natural Language Processing techniques for the short text answers. The paper summarises the approach taken for the R commands and output, and then details the method used for the automated classification of the short text answers and the results obtained. In particular, we show that with FastText sentence embeddings and a tuned Support Vector Machine classifier we obtained an accuracy of 0.89, a Cohen’s K of 0.76, and an F1 score of 0.91 on a binary classification task (i.e. pass or fail). Other experiments that added linguistically motivated features, intended to capture lexical differences between the student’s answer and the gold-standard sentence, did not yield any significant improvement. The paper ends with a discussion of the findings and the next steps to be taken in our research.
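As an illustration of the three figures reported above, the following minimal sketch computes accuracy, Cohen's K and the F1 score for a binary pass/fail task from two lists of labels. It is not the authors' evaluation code: the function name `binary_metrics`, the label strings and the ten toy labels are all assumptions made for this example, and the toy data are unrelated to the paper's dataset.

```python
def binary_metrics(gold, pred, positive="pass"):
    """Return accuracy, Cohen's kappa and F1 for two equal-length label lists.

    Hypothetical helper written for illustration; `positive` names the label
    treated as the positive class when computing F1.
    """
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")

    n = len(gold)
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    tn = n - tp - fp - fn

    # Observed agreement between the gold labels and the predictions.
    accuracy = (tp + tn) / n

    # Agreement expected by chance if the two label sources were independent:
    # P(both say positive) + P(both say negative).
    p_e = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)

    # Kappa is undefined when chance agreement is 1; returning 0.0 in that
    # case is a convention chosen for this sketch.
    kappa = (accuracy - p_e) / (1 - p_e) if p_e < 1 else 0.0

    # F1 is the harmonic mean of precision and recall on the positive class.
    denominator = 2 * tp + fp + fn
    f1 = 2 * tp / denominator if denominator else 0.0

    return {"accuracy": accuracy, "kappa": kappa, "f1": f1}


# Toy labels (invented for this example): 6 passes and 4 fails in the gold set.
gold = ["pass"] * 6 + ["fail"] * 4
pred = ["pass"] * 5 + ["fail"] + ["fail"] * 3 + ["pass"]
metrics = binary_metrics(gold, pred)
print(metrics)
```

Cohen's K is reported alongside accuracy because it discounts the agreement a classifier would reach by chance, which matters when the pass and fail classes are unbalanced.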

Notes

  1. By using a normalised Levenshtein string similarity distance [17].

  2. https://fasttext.cc/ (last accessed July 2019).

  3. https://fasttext.cc/docs/en/crawl-vectors.html (last accessed July 2019).
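Note 1 mentions a normalised Levenshtein measure without stating how it is normalised. The sketch below shows one common variant, given here as an assumption rather than as the authors' definition: the edit distance is divided by the length of the longer string and subtracted from 1, so that identical strings score 1 and completely different strings of equal length score 0. The function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop to save memory
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def normalised_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; normalising by the longer length is an
    assumption of this sketch, not a detail taken from the paper."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest


print(normalised_similarity("kitten", "sitting"))
```

A character-level measure of this kind only captures surface overlap between a student's answer and a reference sentence; it says nothing about paraphrases that use different words.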

References

  1. Angelone, A.M., Menini, S., Tonelli, S., Vittorini, P.: Dataset: short sentences on R analyses in a health informatics subject, June 2019. https://doi.org/10.5281/ZENODO.3257363

  2. Angelone, A.M., Vittorini, P.: The automated grading of R code snippets: preliminary results in a course of health informatics. In: Gennari, R., et al. (eds.) MIS4TEL 2019. AISC, vol. 1007, pp. 19–27. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-23990-9_3

  3. Aprosio, A.P., Moretti, G.: Tint 2.0: an all-inclusive suite for NLP in Italian. In: Proceedings of the Fifth Italian Conference on Computational Linguistics (CLiC-it 2018), Torino, Italy, 10–12 December 2018 (2018). http://ceur-ws.org/Vol-2253/paper58.pdf

  4. Bernardi, A., et al.: On the design and development of an assessment system with adaptive capabilities. In: Di Mascio, T., et al. (eds.) MIS4TEL 2018. AISC, vol. 804, pp. 190–199. Springer, Cham (2019). https://doi.org/10.1007/978-3-319-98872-6_23

  5. Bojanowski, P., Grave, E., Joulin, A., Mikolov, T.: Enriching word vectors with subword information. Trans. Assoc. Comput. Linguist. 5, 135–146 (2017). https://doi.org/10.1162/tacl_a_00051, https://www.aclweb.org/anthology/Q17-1010

  6. Bowman, S.R., Angeli, G., Potts, C., Manning, C.D.: A large annotated corpus for learning natural language inference. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 632–642. Association for Computational Linguistics, Lisbon, September 2015. https://doi.org/10.18653/v1/D15-1075, https://www.aclweb.org/anthology/D15-1075

  7. Burrows, S., Gurevych, I., Stein, B.: The eras and trends of automatic short answer grading. Int. J. Artif. Intell. Educ. 25(1), 60–117 (2015)

  8. Cer, D., et al.: Universal sentence encoder. In: Submission to: EMNLP Demonstration, Brussels, Belgium (2018). https://arxiv.org/abs/1803.11175

  9. Cicchetti, D.V.: Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychol. Assess. 6(4), 284–290 (1994)

  10. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171–4186. Association for Computational Linguistics, Minneapolis, June 2019. https://www.aclweb.org/anthology/N19-1423

  11. Gomaa, W.H., Fahmy, A.A.: A survey of text similarity approaches. Int. J. Comput. Appl. 68(13), 13–18 (2013). https://doi.org/10.5120/11638-7118

  12. Harlen, W., James, M.: Assessment and learning: differences and relationships between formative and summative assessment. Assess. Educ.: Principles Policy Pract. 4(3), 365–379 (1997). https://doi.org/10.1080/0969594970040304

  13. Hsu, C.W., Chang, C.C., Lin, C.J.: A practical guide to support vector classification. Technical report, National Taiwan University (2016)

  14. Kiros, J., Chan, W.: InferLite: simple universal sentence representations from natural language inference data. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31 October–4 November 2018, pp. 4868–4874 (2018). https://aclanthology.info/papers/D18-1524/d18-1524

  15. Kuhn, M.: Building predictive models in R using the caret package. J. Stat. Softw. 28(5), 1–26 (2008). https://doi.org/10.18637/jss.v028.i05

  16. Kusner, M., Sun, Y., Kolkin, N., Weinberger, K.: From word embeddings to document distances. In: International Conference on Machine Learning, pp. 957–966 (2015)

  17. Levenshtein, V.I.: Binary codes capable of correcting deletions, insertions and reversals. In: Soviet Physics Doklady, vol. 10, p. 707 (1966)

  18. Meyer, D., Dimitriadou, E., Hornik, K., Weingessel, A., Leisch, F.: e1071: Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien (2019). https://CRAN.R-project.org/package=e1071. Accessed July 2019

  19. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space (2013)

  20. Mohler, M., Bunescu, R., Mihalcea, R.: Learning to grade short answer questions using semantic similarity measures and dependency graph alignments. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1, HLT 2011, pp. 752–762. Association for Computational Linguistics, Stroudsburg (2011). http://dl.acm.org/citation.cfm?id=2002472.2002568

  21. Pennington, J., Socher, R., Manning, C.D.: GloVe: global vectors for word representation. In: Proceedings of EMNLP (2014)

  22. Peters, M.E., et al.: Deep contextualized word representations. In: Walker, M.A., Ji, H., Stent, A. (eds.) NAACL-HLT, pp. 2227–2237. Association for Computational Linguistics (2018). http://dblp.uni-trier.de/db/conf/naacl/naacl2018-1.html#PetersNIGCLZ18

  23. R Core Team: R: A Language and Environment for Statistical Computing (2018). https://www.R-project.org/

  24. Scholkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge (2001)

  25. Souza, D.M., Felizardo, K.R., Barbosa, E.F.: A systematic literature review of assessment tools for programming assignments. In: 2016 IEEE 29th International Conference on Software Engineering Education and Training (CSEET), pp. 147–156. IEEE, April 2016. https://doi.org/10.1109/CSEET.2016.48

  26. Urbanek, S.: rJava: Low-Level R to Java Interface, R package version 0.9-11 (2019). https://CRAN.R-project.org/package=rJava. Accessed July 2019

Author information

Corresponding author

Correspondence to Pierpaolo Vittorini.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

De Gasperis, G., Menini, S., Tonelli, S., Vittorini, P. (2019). Automated Grading of Short Text Answers: Preliminary Results in a Course of Health Informatics. In: Herzog, M., Kubincová, Z., Han, P., Temperini, M. (eds) Advances in Web-Based Learning – ICWL 2019. ICWL 2019. Lecture Notes in Computer Science, vol 11841. Springer, Cham. https://doi.org/10.1007/978-3-030-35758-0_18

  • DOI: https://doi.org/10.1007/978-3-030-35758-0_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-35757-3

  • Online ISBN: 978-3-030-35758-0

  • eBook Packages: Computer Science, Computer Science (R0)
