DOI: 10.1145/3574318.3574337
research-article

Triplet Loss based Siamese Networks for Automatic Short Answer Grading

Published: 12 January 2023

ABSTRACT

Grading student work is critical for assessing students' understanding and providing necessary feedback. However, answer grading can become monotonous for teachers. On the standard ASAG data set, our system shows substantial improvements over baseline methods in the classification disparity between correct and incorrect answers relative to a reference answer. Our supervised model (1) utilizes recent advances in semantic word embeddings and (2) implements ideas from one-shot learning methods, which are proven to work with minimal data. We present experimental results from a model based on different approaches and demonstrate decent performance on a standard benchmark dataset.
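The abstract does not spell out the loss, so the following is only a minimal sketch of the standard triplet loss named in the title. It assumes the reference answer acts as the anchor, a correct student answer as the positive, and an incorrect one as the negative. The function names, the Euclidean distance, the margin of 1.0, and the toy 2-d vectors are all illustrative assumptions, not the authors' implementation.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(anchor, positive) - d(anchor, negative) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical 2-d "embeddings", chosen only so the arithmetic is easy to follow.
reference = [0.0, 0.0]   # anchor: embedding of the reference answer (assumed role)
correct = [0.0, 1.0]     # distance 1 from the reference
incorrect = [3.0, 4.0]   # distance 5 from the reference

# Correct answer is already closer than the incorrect one by more than the margin.
loss_separated = triplet_loss(reference, correct, incorrect)

# Roles swapped: the "positive" is farther away than the "negative", so the loss is large.
loss_violated = triplet_loss(reference, incorrect, correct)

print(loss_separated, loss_violated)
```

In a Siamese setup, the same encoder would produce all three embeddings, and minimizing this quantity pushes correct answers toward the reference and incorrect answers away from it.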



Published in

FIRE '22: Proceedings of the 14th Annual Meeting of the Forum for Information Retrieval Evaluation
December 2022
101 pages
ISBN: 9798400700231
DOI: 10.1145/3574318

Copyright © 2022 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Qualifiers: research-article, Research, Refereed limited

Overall acceptance rate: 19 of 64 submissions, 30%