Abstract
Automatic Essay Scoring (AES) engines have gained popularity among a multitude of institutions for scoring test-takers’ responses and have therefore witnessed rising demand in recent times. However, several studies have demonstrated that adversarial attacks severely hamper the performance of existing state-of-the-art AES engines. We therefore propose a robust architecture for AES systems that leverages Capsule Neural Networks, contextual BERT-based text representations, and key features extracted from the text. This end-to-end pipeline captures semantics, coherence, and organizational structure along with fundamental rule-based features such as grammatical and spelling errors. The proposed method is validated by extensive experimentation and comparison with state-of-the-art baseline models. Our results demonstrate that this approach performs significantly better on 6 out of 8 prompts of the Automated Student Assessment Prize (ASAP) dataset. In addition, it shows the best overall performance, with a Quadratic Weighted Kappa (QWK) of 81%. Moreover, we empirically demonstrate that it successfully identifies adversarial responses and scores them lower.
A. Sharma and A. Kabra: equal contribution; work done at Delhi Technological University.
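The abstract reports agreement as Quadratic Weighted Kappa (QWK), where 1 means perfect agreement with the human scores and 0 means chance-level agreement. As an illustration only, here is a minimal pure-Python sketch of the standard QWK definition. The function name, its parameters, and the assumption of integer scores within a known range are ours and are not taken from the authors' code.

```python
def quadratic_weighted_kappa(rater_a, rater_b, min_rating, max_rating):
    """Quadratic weighted kappa between two equal-length lists of integer scores.

    Illustrative helper (hypothetical name); assumes every score lies in
    [min_rating, max_rating].
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("score lists must be non-empty and of equal length")
    if any(not (min_rating <= s <= max_rating) for s in list(rater_a) + list(rater_b)):
        raise ValueError("scores must lie within [min_rating, max_rating]")

    n_ratings = max_rating - min_rating + 1
    n_items = len(rater_a)
    if n_ratings == 1:
        return 1.0  # only one possible score, so agreement is trivially perfect

    # Observed confusion matrix: observed[i][j] counts essays scored i by A and j by B.
    observed = [[0] * n_ratings for _ in range(n_ratings)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_rating][b - min_rating] += 1

    # Marginal score histograms for each rater.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n_ratings)) for j in range(n_ratings)]

    numerator = 0.0    # weighted observed disagreement
    denominator = 0.0  # weighted disagreement expected by chance
    for i in range(n_ratings):
        for j in range(n_ratings):
            weight = ((i - j) ** 2) / ((n_ratings - 1) ** 2)
            expected = hist_a[i] * hist_b[j] / n_items
            numerator += weight * observed[i][j]
            denominator += weight * expected

    if denominator == 0:
        return 1.0  # both raters gave the same constant score to every essay
    return 1.0 - numerator / denominator


if __name__ == "__main__":
    human = [2, 3, 4, 4, 1, 3]    # made-up scores for illustration
    machine = [2, 3, 3, 4, 1, 2]
    print(quadratic_weighted_kappa(human, machine, min_rating=1, max_rating=4))
```

Because the weights grow with the squared distance between the two scores, a prediction that is off by two points is penalised four times as heavily as one that is off by a single point, which suits ordinal essay scores.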
Notes
1. Our code is available at: https://github.com/ECMLPKDD/CapsRater-FeatureCapture.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Sharma, A., Kabra, A., Kapoor, R. (2021). Feature Enhanced Capsule Networks for Robust Automatic Essay Scoring. In: Dong, Y., Kourtellis, N., Hammer, B., Lozano, J.A. (eds) Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track. ECML PKDD 2021. Lecture Notes in Computer Science, vol 12979. Springer, Cham. https://doi.org/10.1007/978-3-030-86517-7_23
DOI: https://doi.org/10.1007/978-3-030-86517-7_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86516-0
Online ISBN: 978-3-030-86517-7
eBook Packages: Computer Science, Computer Science (R0)