Self-training vs Pre-trained Embeddings for Automatic Essay Scoring

Zhou, Xianbing; Yang, Liang; Fan, Xiaochao; Ren, Ge; Yang, Yong; Lin, Hongfei

doi:10.1007/978-3-030-88189-4_12


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 13026)

Included in the following conference series:

China Conference on Information Retrieval

Abstract

It is commonly believed that using pre-trained word vectors or pre-trained language models effectively improves task performance. This is not always the case. A sufficient amount of annotated data is usually required to fine-tune pre-trained language models and pre-trained word vectors for downstream tasks. In addition, the relevance of the training corpus to the task corpus also affects task performance to a large extent. In this paper, we systematically compare the effects of different types of pre-trained embeddings and self-training embeddings on the performance of automatic essay scoring (AES). We also propose an effective solution to the above problem: an AES method that uses both pre-trained and self-training word embeddings. We conducted experiments on a publicly available dataset comprising 8 subsets, and the experimental results show the effectiveness of this method.

Notes

1. https://nlp.stanford.edu/projects/glove/

Acknowledgments

This work was supported by a grant from the Xinjiang Uygur Autonomous Region Natural Science Foundation (Project No. 2021D01B72) and by the Natural Science Foundation of China (No. 62066044).

Author information

Authors and Affiliations

School of Computer Science and Technology, Xinjiang Normal University, Ürümqi, China
Xianbing Zhou, Xiaochao Fan, Ge Ren & Yong Yang
Department of Computer Science and Technology, Dalian University of Technology, Dalian, China
Liang Yang & Hongfei Lin

Corresponding author

Correspondence to Hongfei Lin.

Editor information

Editors and Affiliations

Dalian University of Technology, Dalian, China
Hongfei Lin
Department of Computer Science, Tsinghua University, Beijing, China
Min Zhang
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Liang Pang

About this paper

Cite this paper

Zhou, X., Yang, L., Fan, X., Ren, G., Yang, Y., Lin, H. (2021). Self-training vs Pre-trained Embeddings for Automatic Essay Scoring. In: Lin, H., Zhang, M., Pang, L. (eds) Information Retrieval. CCIR 2021. Lecture Notes in Computer Science, vol 13026. Springer, Cham. https://doi.org/10.1007/978-3-030-88189-4_12

DOI: https://doi.org/10.1007/978-3-030-88189-4_12
Published: 05 October 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88188-7
Online ISBN: 978-3-030-88189-4
eBook Packages: Computer Science, Computer Science (R0)
