Online Self-boost Learning for Chinese Grammatical Error Correction

  • Conference paper
  • In: Natural Language Processing and Chinese Computing (NLPCC 2022)
  • Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13551)

Abstract

Grammatical error correction (GEC) aims to automatically detect and correct grammatical errors in sentences. With the development of deep learning, neural machine translation-based approaches have become the mainstream for this task. Recently, Chinese GEC has attracted growing attention. However, Chinese GEC faces two main problems that limit model learning: (1) insufficient data and (2) flexible error forms. In this paper, we attempt to address these limitations by proposing a method called online self-boost learning for Chinese GEC. Online self-boost learning enables the model to generate, from each original sample within each batch, multiple instances whose errors target the model's weaknesses, and to learn from the new data immediately without additional I/O. Taking advantage of a property of the new data, namely that all generated instances share one target, we introduce a consistency loss that drives the model to produce similar distributions for different inputs with the same target. Our method fully exploits the potential knowledge in the annotated data; meanwhile, it can incorporate unlabeled data, extending it to a semi-supervised method. Extensive experiments and analyses demonstrate the effectiveness of our method, and it achieves a state-of-the-art result on the Chinese GEC benchmark.
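As a concrete illustration of the two ideas above (in-batch generation of extra erroneous inputs that all share one gold target, plus a distribution-level consistency loss), here is a minimal PyTorch-style sketch of one training step. It is an assumption-laden sketch, not the authors' implementation: `perturb`, `self_boost_step`, `k`, and `alpha` are hypothetical names, random token replacement stands in for the paper's weakness-targeted error generation, and a symmetric KL term stands in for its consistency loss.

```python
# Hypothetical sketch of an online self-boost training step (not the paper's code).
# Assumes a seq2seq model whose forward(src, tgt) returns logits of shape (B, T, V).
import torch
import torch.nn.functional as F


def perturb(src: torch.Tensor, vocab_size: int, noise_prob: float = 0.15) -> torch.Tensor:
    """Make one noisy variant of a batch of source sentences by random token
    replacement -- a simple stand-in for weakness-targeted error generation."""
    mask = torch.rand(src.shape, device=src.device) < noise_prob
    random_tokens = torch.randint_like(src, high=vocab_size)
    return torch.where(mask, random_tokens, src)


def self_boost_step(model, src, tgt, vocab_size, k=2, alpha=1.0):
    """One batch: the original source plus k generated variants all share the
    same gold target, so the new data is learned in place without extra I/O."""
    inputs = [src] + [perturb(src, vocab_size) for _ in range(k)]
    log_probs = [F.log_softmax(model(x, tgt), dim=-1) for x in inputs]  # each (B, T, V)

    # Supervised loss: every input must still be corrected to the same target.
    ce = sum(F.nll_loss(lp.transpose(1, 2), tgt) for lp in log_probs) / len(log_probs)

    # Consistency loss: symmetric KL pulls each variant's output distribution
    # toward the original's, i.e., similar distributions for different inputs
    # that share one target.
    consistency = 0.0
    for lp in log_probs[1:]:
        consistency = consistency + 0.5 * (
            F.kl_div(lp, log_probs[0], reduction="batchmean", log_target=True)
            + F.kl_div(log_probs[0], lp, reduction="batchmean", log_target=True)
        )
    consistency = consistency / k

    return ce + alpha * consistency
```

In practice one would also mask padding tokens out of both losses and bias the generated errors toward the model's own confusions; the sketch only mirrors the structure described in the abstract.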

Notes

  1. https://github.com/nusnlp/m2scorer.
  2. https://github.com/fastnlp/CPT.
  3. https://github.com/pytorch/fairseq.


Acknowledgement

This research was supported by the National Natural Science Foundation of China under grant No. 61976119 and the Natural Science Foundation of Tianjin under grant No. 18ZXZNGX00310.

Author information

Corresponding author

Correspondence to Jie Liu.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Xie, J., Dang, K., Liu, J. (2022). Online Self-boost Learning for Chinese Grammatical Error Correction. In: Lu, W., Huang, S., Hong, Y., Zhou, X. (eds) Natural Language Processing and Chinese Computing. NLPCC 2022. Lecture Notes in Computer Science, vol. 13551. Springer, Cham. https://doi.org/10.1007/978-3-031-17120-8_30

  • DOI: https://doi.org/10.1007/978-3-031-17120-8_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-17119-2

  • Online ISBN: 978-3-031-17120-8

  • eBook Packages: Computer Science, Computer Science (R0)
