Abstract
Text review is the task of determining whether the knowledge expressed in a student answer is consistent with a given reference answer. In professional scenarios, labeled samples are scarce, typically numbering from dozens to a few hundred, which makes text review considerably more challenging. This paper proposes a text review method based on data augmentation, in which new training samples are generated by combining different positive and negative labeled samples. The review model then infers labels for the unlabeled samples, and the pseudo-labeled samples with the highest confidence are selected for subsequent training rounds. Experimental results on real national qualification exam datasets show that, under limited sampling constraints, our method improves on traditional approaches to the text review task.
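The abstract describes two algorithmic components: augmenting a small labeled set by combining positive and negative samples, and iteratively adding high-confidence pseudo-labeled samples to the training pool. The following is a minimal, illustrative Python sketch of the pseudo-label selection loop only; the scikit-learn-style model interface (predict_proba, fit), the confidence threshold, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def select_pseudo_labels(model, unlabeled_texts, threshold=0.95):
    """Keep only unlabeled samples the review model classifies with high confidence.

    Assumes a binary "consistent / inconsistent" classifier exposing predict_proba.
    """
    probs = model.predict_proba(unlabeled_texts)      # shape: (n_samples, 2)
    confidences = probs.max(axis=1)                   # highest class probability per sample
    pseudo_labels = probs.argmax(axis=1)              # predicted class per sample
    keep = np.where(confidences >= threshold)[0]      # indices passing the confidence filter
    return [unlabeled_texts[i] for i in keep], pseudo_labels[keep]


def self_training_round(model, labeled_texts, labeled_y, unlabeled_texts, threshold=0.95):
    """One round of self-training: add confident pseudo-labels, then refit the model."""
    new_texts, new_y = select_pseudo_labels(model, unlabeled_texts, threshold)
    model.fit(labeled_texts + new_texts,
              np.concatenate([np.asarray(labeled_y), new_y]))
    return model
```

In practice the threshold and the number of rounds are hyperparameters; a higher threshold admits fewer but more reliable pseudo-labels into the next round of training.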
This work was supported by the National Natural Science Foundation of China, NSFC (62376138), and the Innovative Development Joint Fund Key Projects of Shandong NSF (ZR2022LZH007).
L. Yang and T. Yang—Equal contribution.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Yang, L., Yang, T., Yuan, F., Sun, Y. (2024). Professional Text Review Under Limited Sampling Constraints. In: Sun, Y., Lu, T., Wang, T., Fan, H., Liu, D., Du, B. (eds) Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2023. Communications in Computer and Information Science, vol 2013. Springer, Singapore. https://doi.org/10.1007/978-981-99-9640-7_21
Print ISBN: 978-981-99-9639-1
Online ISBN: 978-981-99-9640-7